DOCUMENT RESUME

ED 213 761                                                  TM 820 203

AUTHOR          Gialluca, Kathleen A.; Weiss, David J.
TITLE           Dimensionality of Measured Achievement Over Time.
INSTITUTION     Minnesota Univ., Minneapolis. Dept. of Psychology.
SPONS AGENCY    Office of Naval Research, Arlington, Va. Personnel
                and Training Research Programs Office.
REPORT NO       ONR-RR-81-5
PUB DATE        Dec 81
CONTRACT        N00014-79-C-0172
NOTE            46p.

EDRS PRICE      MF01/PC02 Plus Postage.
DESCRIPTORS     *Academic Achievement; *Achievement Gains;
                Achievement Tests; *Biology; *College Mathematics;
                Factor Structure; Higher Education; *Measurement
IDENTIFIERS     *Change Scores; Difference Scores; Dimensional
                Analysis; *Measurement of Change



ABSTRACT

Some type of difference or change score is frequently
used to quantify the effects of experimental treatments and
educational programs on individuals and on groups of individuals. Two
studies investigated the tenability of the assumption that classroom
instruction results in increases in students' achievement levels
while the qualitative nature of that achievement remains constant
across time. The data utilized were the item responses to tests in
basic mathematics and in general biology administered as pretests and
after instruction to students enrolled in those courses. Results
indicated that this assumption was not tenable in the biology data
set, where increases in mean achievement level were accompanied by
corresponding changes in the factor structure underlying the item
responses. For the mathematics data, however, there was no such
violation of the assumption; as student achievement levels increased,
the underlying factor structure remained unchanged. The implications
of these results for psychology, education, and program evaluation
are noted. (Author/GK)



Reproductions supplied by EDRS are the best that can be made
from the original document.



Dimensionality of Measured
Achievement Over Time

Kathleen A. Gialluca
and
David J. Weiss






Research Report 81-5
December 1981



Computerized Adaptive Testing Laboratory
Psychometric Methods Program
Department of Psychology
University of Minnesota
Minneapolis, MN 55455

This research was supported by funds from the Air
Force Human Resources Laboratory, the Army Research
Institute, the Air Force Office of Scientific
Research, and the Office of Naval Research,
and monitored by the Office of Naval Research.

Approved for public release; distribution unlimited.
Reproduction in whole or in part is permitted for
any purpose of the United States Government.



Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)

REPORT DOCUMENTATION PAGE          READ INSTRUCTIONS BEFORE COMPLETING FORM

1.   REPORT NUMBER: Research Report 81-5
2.   GOVT ACCESSION NO.:
3.   RECIPIENT'S CATALOG NUMBER:
4.   TITLE (and Subtitle): Dimensionality of Measured Achievement
     Over Time
5.   TYPE OF REPORT & PERIOD COVERED: Technical Report
6.   PERFORMING ORG. REPORT NUMBER:
7.   AUTHOR(s): Kathleen A. Gialluca and David J. Weiss
8.   CONTRACT OR GRANT NUMBER(s): N00014-79-C-0172
9.   PERFORMING ORGANIZATION NAME AND ADDRESS:
     Department of Psychology
     University of Minnesota
     Minneapolis, Minnesota 55455
10.  PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS:
     P.E.: 6115N    Proj.: RR042-04
     T.A.: RR042-04-01
11.  CONTROLLING OFFICE NAME AND ADDRESS:
     Personnel and Training Research Programs
     Office of Naval Research
     Arlington, Virginia 22217
12.  REPORT DATE: December 1981
13.  NUMBER OF PAGES: 34
14.  MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
15.  SECURITY CLASS. (of this report):
15a. DECLASSIFICATION/DOWNGRADING SCHEDULE:
16.  DISTRIBUTION STATEMENT (of this Report):
     Approved for public release; distribution unlimited. Reproduction in
     whole or in part is permitted for any purpose of the United States
     Government.
17.  DISTRIBUTION STATEMENT (of the abstract entered in Block 20, if
     different from Report):
18.  SUPPLEMENTARY NOTES:
     This research was supported by funds from the Air Force Human Resources
     Laboratory, the Army Research Institute, the Air Force Office of
     Scientific Research, and the Office of Naval Research, and monitored by
     the Office of Naval Research.
19.  KEY WORDS (Continue on reverse side if necessary and identify by block
     number):
     achievement testing          factor comparisons
     change scores                factor structure
     measurement of change
     measurement of growth
20.  ABSTRACT (Continue on reverse side if necessary and identify by block
     number):

     Some type of difference or change score is frequently used to quantify
     the effects of experimental treatments and educational programs on
     individuals and on groups of individuals. Whether the change measurement
     involves the use of simple difference scores, their derivatives, or some
     more complex methodological design, the measurement process itself
     assumes that the treatment or instruction results in higher levels of
     the originally measured variable and that the only change that occurs is
     a quantitative one. If this assumption is not met, then the computation
     of any type of difference score is inappropriate and the scores
     themselves are useless for measuring growth or change.

     Two studies investigated the tenability of the assumption that classroom
     instruction results in increases in students' achievement levels while
     the qualitative nature of that achievement remains constant across time.
     The data utilized were the item responses to tests in basic mathematics
     and in general biology administered as pretests and after instruction to
     students enrolled in those courses.

     Results indicated that this assumption was not tenable in the biology
     data set, where increases in mean achievement level were accompanied by
     corresponding changes in the factor structure underlying the item
     responses. For the mathematics data, however, there was no such
     violation of the assumption: As student achievement levels increased,
     the underlying factor structure remained unchanged. The implications of
     these results for psychology, education, and program evaluation are
     noted.

DD FORM 1473 (1 JAN 73)    EDITION OF 1 NOV 65 IS OBSOLETE
S/N 0102-LF-014-6601

Unclassified
SECURITY CLASSIFICATION OF THIS PAGE (When Data Entered)



Contents

Introduction
    Purpose

Study I
    Method
        Subjects and Tests
        Analyses
            Differences in Achievement Level Estimates
            Differences in the Structure of Achievement
    Results
        Differences in Achievement Level Estimates
            Total Score Differences
            Item Difficulties
            Correlation Between Scores
        Differences in the Structure of Achievement
            Internal Consistency Reliability
            Number of Factors Extracted
            Factor Similarity
    Conclusions
        Differences in Achievement Level Estimates
        Differences in the Structure of Achievement

Study II
    Method
        Subjects
        Design
            Tests
            Experimental Groups
        Analyses
            Differences in Achievement Level Estimates: Test A
            Differences in the Structure of Achievement: Test A
            Differences in Achievement Level Estimates: Test B
            Differences in the Structure of Achievement: Test B
    Results
        Effects of Item Repetition
        Missing Data
        Differences in Achievement Level Estimates: Test A
            Total Score Differences
            Item Difficulties
        Differences in the Structure of Achievement: Test A
            Internal Consistency Reliability
            Number of Factors Extracted
            Factor Similarity
        Differences in Achievement Level Estimates: Test B
            Total Score Differences
            Item Difficulties
        Differences in the Structure of Achievement: Test B
            Internal Consistency Reliability
            Number of Factors Extracted
            Factor Similarity
    Conclusions
        Differences in Achievement Level Estimates
        Differences in the Structure of Achievement

Discussion and Conclusions

References

Appendix: Supplementary Tables



Acknowledgments

Data utilized in Study I of this report were obtained from students enrolled
in General College mathematics courses at the University of Minnesota during
fall quarter 1979. Appreciation is extended to these students and to
Douglas Robertson, Mathematics Coordinator of General College, for their
participation in this research.

Data utilized in Study II of this report were obtained from volunteer
students in General Biology, Biology 1-011, at the University of Minnesota
during winter quarter 1980. Appreciation is extended to these students, and
to Kathy Swart and Norman Kerr of the General Biology staff, for their
participation in this research. Gage Kingsbury and Elana Broch were
responsible for the research design and collection of biology data during
winter 1980, of which the data reported herein were a part.



Technical Editor: Barbara Leslie Camm



Dimensionality of Measured Achievement Over Time

The measurement of individual or group change is central to many issues in
the fields of psychology, education, and program evaluation. Psychologists,
educators, and (more recently) evaluators typically use differences in test
scores to quantify the effects of experimental treatments and educational
programs on individuals and on groups of individuals.

The typical paradigm for measuring change involves the administration of a
standardized achievement test both before and after an experimental treatment or
program implementation; the effect of the treatment intervention is then
considered to be a function of the mean difference between the two sets of test
scores. If two or more groups of students are involved, comparisons can also be
made between treatment and control groups, or among groups exposed to various
treatments or involved in several different programs. Again, evaluation of
treatment effects involves comparing the mean achievement gain (typically, a
function of the difference scores) observed for each group. Individual gain or
change is also frequently used to measure an individual's growth in achievement
level or change due to a treatment or special program.

Lord (1963) and Cronbach and Furby (1970), among others, have discussed the
methodological and statistical problems involved in using difference scores to
measure change or growth and have presented some possible solutions. Whether
measurements of change involve the use of simple difference scores, their
derivatives, or some more complex methodological design, the measurement process
itself assumes that the treatment or instruction results in increased levels of
the same trait or characteristic that was measured originally and that the only
change that occurs is a quantitative one.

That this assumption may be violated has long been evident in studies of
intelligence and intellectual growth. Garrett (1946) noted that "intelligence
changes in its organization" (p. 373) and called for corresponding changes in
the way intelligence is measured. This "differentiation hypothesis" spawned
much research (see Reinert, 1970, for a review) concerning the changes in the
structure and organization of intelligence throughout the human life span. Some
of these studies report results supporting the hypothesis of age
differentiation; others offer support for a hypothesis of age integration, and
still others provide evidence in support of both these hypotheses. Nearly all
this research, however, has found that the structure of intelligence, as defined
by factor analysis, does not remain constant with age and experience.

Other authors (Anastasi, 1936; Ferguson, 1954; Games, 1962; Woodrow, 1938,
1939a, 1939b, 1939c) have investigated the changes in verbal ability and
intellectual factor structure that accompany shorter term training and practice.
Similar factor-analytic investigations have been made in the areas of
psychomotor behavior (Fleishman, 1951, 1957, 1960; Fleishman & Hempel, 1954,
1955; Greene, 1943), psycholinguistic abilities (Querishi, 1967), word
association (Sullivan & Moran, 1967; Swartz & Moran, 1968), and even the
learning of Morse code (Fleishman & Fruchter, 1960). All these authors have
found that the factorial structure of abilities underlying task performance
changes in a systematic way with training and practice. An individual's status
at a later point in time, then, may be qualitatively different from his/her
status as originally measured.

Wohlwill (1970) discusses this issue of quantitative versus qualitative
change more generally in the area of developmental psychology and, like Garrett
(1946), calls for more sophisticated scaling methods which will

    ... allow us to assess an individual's status on a developmental
    dimension in a manner such as to ensure not only comparability of content
    for the different parts of that dimension, but at the same time a
    continuous scale along which developmental change can be charted ....
    Postulating a unitary dimension across the age span under investigation
    presupposes that there are no major discontinuities in the development
    of the behavior in question, such as there obviously are in the
    assessment of intelligence when we move from infancy to childhood. (p. 154)

Although Reinert (1970) called for the investigation of possible
factor-structure changes in areas other than intelligence and abilities more
than a decade ago, no research has yet extended this line of questioning into
the area of classroom achievement. That is, there have been no reported studies
that have systematically investigated whether the individual and group changes
that occur after classroom instruction or program participation are quantitative
changes in the level of achievement, as is generally assumed, or whether more
qualitative changes in the structure of the achievement variable have occurred.

Kingsbury and Weiss (1979) studied the effects of testing students at
different points in instruction. They reported that the single factor extracted
from the item responses to a college general biology examination administered on
the first day of class and the factor extracted from the item responses to a
classroom midquarter examination differed markedly from each other in terms of
strength; however, they could not further investigate the similarity of the
factor pattern loadings from both administrations. They cautioned that
replications of their findings contrasting the pretest factor with the later
achievement factor would render difference scores "completely useless" as
indicators of achievement level growth, since different variables would, in
fact, be measured at the two points in time.

The importance of such a conclusion should not be underestimated. If
different characteristics are, in fact, being measured at two different
occasions, then the computation of any type of difference score is inappropriate
and the evaluation of program effectiveness and gains in individual student
achievement must be made on some other basis. It is justifiable to use
difference scores (statistical and methodological issues notwithstanding) only
when it can be demonstrated that quantitative changes are the only changes
accompanying instruction.

Purpose

The objectives of the present studies were to investigate the nature of the
changes in the dimensionality of achievement that occurred following instruction
in two different achievement domains, basic mathematics and general biology, and
to determine the appropriateness of calculating difference scores in order to
measure change in these domains.



STUDY I

Method

Subjects and Tests

Data were obtained from students enrolled in mathematics classes at the
University of Minnesota's General College during the fall quarter of 1979.
These students were administered a 35-item Arithmetic Placement Test (APT) on
the first day of class (pretest) and again as a final examination (posttest).
The APT is composed of five-alternative multiple-choice items covering such
topics as addition, subtraction, multiplication, and division of whole numbers,
fractions, decimals, and percents.

Item responses were coded as correct, incorrect, or missing for the 259
students. However, only 136 of the students answered every item on the APT on
both occasions; i.e., 123 students omitted or did not reach at least one item on
either occasion. In many cases, clusters of items were omitted in the middle of
the tests, which implied that students were omitting the groups of items for
which they did not know the answers, rather than reaching a time limit for the
test. To deal with this problem of missing data, a 15%-missing-data criterion
was employed. A student's response protocol was deleted from the data set if
the student omitted more than five items (i.e., 15% of 35 items) on either the
pretest or the posttest. This resulted in a group of 220 students on which all
further analyses were based. For these 220 students, missing data were coded as
incorrect on the assumption that the student did not answer the item because
he/she did not know the answer and was unwilling to guess.

Analyses 

Differences in achievement level estimates. The question of interest with
respect to achievement level estimates was whether there were differences in
achievement level estimates due to instruction; i.e., were students growing or
gaining in achievement levels throughout the course of instruction? Analyses
pertinent to this question included comparison of the frequency distributions
of number-correct scores both before and after instruction and a t test for the
difference between the means of scores on the pretest and the posttest.
Comparisons were also made of the distributions of item difficulties for each
administration of the APT. The correlation between scores on the pretest and
posttest was computed as an indication of the degree to which the scores were
linearly related.



Differences in the structure of achievement. A related but less often
investigated issue is whether there are differences in the structure of item
responses due to instruction. Investigation of this issue involved computing and
comparing the values of coefficient alpha as an index of internal consistency,
which is related to the average level of intercorrelation of the items. More
germane to this issue, however, was whether the factor structure underlying the
test changed with instruction or whether it remained constant. Consequently,
principal axes factor analyses were performed separately on the pretest and
posttest item responses. Pearson product-moment correlations were computed
between pairs of item responses, and the diagonal elements of the interitem
correlation matrices were replaced with initial estimates of the communalities
of each item, as given by the squared multiple correlation between that item and
the other items in the matrix. An iterative procedure for improving these
communality estimates was used, successively extracting factors and
re-estimating the communalities. This process continued until the difference
between two successive communality estimates was negligible (see Nie, Hull,
Jenkins, Steinbrenner, & Bent, 1975).
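
The iterative procedure described above can be sketched in a few lines. This is a minimal illustration assuming NumPy, not the SPSS routine the authors ran; the function name, the convergence tolerance, and the iteration cap are assumptions chosen for the example.

```python
import numpy as np


def principal_axis_factoring(R, n_factors=1, tol=1e-4, max_iter=200):
    """Iterated principal axes factoring of a correlation matrix R.

    Returns (loadings, communalities). Tolerance and iteration cap are
    illustrative choices, not values taken from the report.
    """
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlation of each item
    # with all the other items, 1 - 1 / diag(R^-1).
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)          # communalities on the diagonal
        vals, vecs = np.linalg.eigh(reduced)   # symmetric eigendecomposition
        top = np.argsort(vals)[::-1][:n_factors]
        loadings = vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
        new_h2 = (loadings ** 2).sum(axis=1)   # re-estimated communalities
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    # Eigenvector signs are arbitrary; orient each factor positively.
    for j in range(loadings.shape[1]):
        if loadings[:, j].sum() < 0:
            loadings[:, j] *= -1.0
    return loadings, h2
```

Applied to a correlation matrix built from a known one-factor pattern, the routine should return loadings close to the generating values, which is a convenient sanity check before using it on item data.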

Random sets of item responses were generated by simulating the responses of
220 students to 35 items such that the probability of a correct answer by any
simulee to an item was equal to the difficulty (proportion correct) of that
item. This was done separately for the pretest and the posttest. Identical
procedures as performed for the real data were carried out for intercorrelating
the item responses and factoring the resulting matrix. The results of the
factor analyses of real and random data were compared to determine the number of
"nonrandom" factors existing in the real data.
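
A minimal sketch of how such random comparison data might be generated is given below. It assumes independent Bernoulli draws per simulee and item, which is one reading of the description above; the function names and the seed are hypothetical, and the authors' actual generator is not documented in this section.

```python
import random


def item_difficulties(responses):
    """Proportion correct for each item, given rows of 0/1 item scores."""
    n = len(responses)
    k = len(responses[0])
    return [sum(row[i] for row in responses) / n for i in range(k)]


def simulate_random_responses(difficulties, n_simulees=220, seed=0):
    """Simulate 0/1 responses with P(correct) equal to each item's difficulty.

    Every response is drawn independently, so any factor found in the
    simulated data reflects chance alone.
    """
    rng = random.Random(seed)  # seed is an arbitrary choice for reproducibility
    return [
        [1 if rng.random() < p else 0 for p in difficulties]
        for _ in range(n_simulees)
    ]
```

In use, the difficulties would be computed once from the real pretest data and once from the real posttest data, and each simulated matrix would then be intercorrelated and factored in the same way as the real responses.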

The final factor solutions for the pretest and the posttest were then
compared in terms of numbers of factors extracted and the similarities between
them. Factor similarity was evaluated by computing the root-mean-square
deviation, the product-moment correlation coefficient, and the coefficient of
congruence between the factor loadings of the factors extracted at each test
administration (see Harman, 1976, pp. 343-344). These similarity measures were
compared with values obtained from the two sets of random data, as recommended
by Nesselroade and Baltes (1970).

Results

Differences in Achievement Level Estimates



Total score differences. Frequency distributions of number-correct scores
for both administrations of the APT are presented in Appendix Table A; the
frequency polygons are displayed in Figure 1. This figure shows that although
the distribution of pretest scores was approximately symmetric, the distribution
of posttest scores was negatively skewed, indicating the presence of a ceiling
effect. Only four students answered all 35 items correctly on the posttest; an
additional 77 students (or 35%) incorrectly answered fewer than four items. The
mean score on the pretest was 22.26, the median was 22.74, and the standard
deviation was 5.97. For the posttest these statistics were 28.91, 30.10, and
4.88, respectively. A one-tailed t test for the difference between means of
dependent groups was calculated to be 18.67, with probability < .0001.
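
As a rough check on the reported t value, the dependent-groups statistic can be approximated from the summary statistics alone. The sketch below assumes the standard formula for the standard deviation of difference scores and uses the pretest-posttest correlation of .542 reported in this section; the variable names are illustrative, and rounding in the published figures means the result will only approximate 18.67.

```python
import math

n = 220                              # students retained for analysis
mean_pre, sd_pre = 22.26, 5.97       # pretest mean and standard deviation
mean_post, sd_post = 28.91, 4.88     # posttest mean and standard deviation
r = 0.542                            # pretest-posttest correlation

# Standard deviation of the difference scores for correlated measures.
sd_diff = math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * r * sd_pre * sd_post)

# t for the difference between means of dependent groups.
t_stat = (mean_post - mean_pre) / (sd_diff / math.sqrt(n))

print(round(sd_diff, 2), round(t_stat, 1))
```

The value obtained this way is about 18.7, in line with the reported 18.67.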

Item difficulties. The differences in raw score distributions observed
between pretest and posttest were mirrored in the distributions of item
difficulties for the two administrations of the APT, as shown in Table 1.
Although the pretest items were, on the average, answered correctly more often
than not, nearly a third of them (i.e., 10 of 35) were answered incorrectly by
at least half of the students. For the posttest, however, only two of the items
were as difficult. In fact, one third of the items (12 of 35) were answered
correctly by more than 90% of the students.
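
Because item difficulty is defined here as proportion correct, the mean item difficulty must equal the mean number-correct score divided by the number of items. The short check below applies that identity to the means reported above; the variable names are illustrative only.

```python
N_ITEMS = 35
mean_scores = {"pretest": 22.26, "posttest": 28.91}

# Mean difficulty = mean number-correct score / number of items.
mean_difficulty = {
    occasion: round(score / N_ITEMS, 2) for occasion, score in mean_scores.items()
}
print(mean_difficulty)
```

The resulting values, .64 and .83, agree with the mean difficulties listed in Table 1.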



                    Table 1
           Frequency Distributions of
           Item Difficulties for APT
     Administered as Pretest and as Posttest

    Range of Item         Number of Items
    Difficulty          Pretest    Posttest

     .00 -  .10            0           0
     .11 -  .20            1           0
     .21 -  .30            1           0
     .31 -  .40            4           0
     .41 -  .50            4           2
     .51 -  .60            5           0
     .61 -  .70            5           3
     .71 -  .80            6           9
     .81 -  .90            5           9
     .91 - 1.00            4          12

    Mean Difficulty       .64         .83



Correlation between scores. The Pearson product-moment correlation
coefficient between number-correct scores at the two administrations of the APT
was .542. This relatively low value, coupled with the evidence of mean score
increases, reveals that students did not, to a great extent, maintain their
relative standings in the course after instruction.

Differences in the Structure of Achievement

Internal consistency reliability. The internal consistency reliability of
the APT, as indexed by coefficient alpha, was .836 for the pretest and .835 for
the posttest. That the reliability coefficient remained essentially constant
provides some evidence for concluding that the items were functioning together
in the same manner before and after instruction. However, since the variance of
the scores decreased somewhat from pretest to posttest (see Appendix Table A),
the stability of coefficient alpha may actually reflect a slight increase in the
average interitem correlation.
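
For reference, coefficient alpha can be computed directly from item-level scores using the usual formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores). The sketch below is a generic illustration with hypothetical function names, not the program used for the reported values.

```python
def sample_variance(values):
    """Unbiased sample variance (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def coefficient_alpha(item_scores):
    """Coefficient alpha for rows of item scores (one row per examinee)."""
    k = len(item_scores[0])
    item_variances = [
        sample_variance([row[i] for row in item_scores]) for i in range(k)
    ]
    total_variance = sample_variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

The formula makes the point in the text concrete: alpha depends on the ratio of summed item variances to total-score variance, so a stable alpha alongside a smaller total variance implies a change in how the items covary.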

Number of factors extracted. The eigenvalues and percent of total variance
accounted for by the first 15 factors from the APT and random data are given in
Appendix Table B. The plots of eigenvalues versus factors extracted for both
the APT and the random data are given in Figure 2a for the pretest and in Figure
2b for the posttest. In both cases, there was one relatively strong factor in
the data; the eigenvalue for the first factor extracted from the APT was much
larger than the eigenvalues for the remaining factors in the APT and for all the
factors in the random data. The same cannot be said for any of the remaining
factors. It was concluded that a one-factor solution adequately described the
item response data from both the pretest and the posttest. The FACTOR
subroutine in SPSS (Nie et al., 1975) was then run again on the data from each
administration, specifying a single-factor solution each time.

Factor similarity. The factor loadings on the single factor extracted from
each administration of the APT and from corresponding random data are given in
Table 2. The loadings presented in Table 2 were of moderate magnitude; the
majority of the loadings were greater than .300, but all were less than .700. The
patterns and the magnitudes of the loadings were essentially the same across
test administrations. For example, Items 2 through 5 and Item 28 were among the
items with the lowest loadings at the pretest; the same was true for these items
at the posttest. The items with the highest loadings at the pretest were also
among the items with the highest loadings at the posttest. That the magnitude
of the loadings was similar for the two administrations can also be seen by
comparing the percentage of total variance accounted for by each factor. The
single factor extracted from the APT pretest data accounted for 13.92% of the
total variance compared to 3.05% for the random data. The factor extracted from
the APT posttest data was only slightly stronger, accounting for 14.59% of the
total variance as compared to 2.40% in the random data.

Table 3 presents the measures of factor similarity between the APT factor
loadings at pretest and at posttest. The root-mean-square deviation between the
loadings extracted at each administration is sensitive to differences in the
absolute levels of the loadings; low values indicate only minor differences
between the values of the two sets of loadings. The root-mean-square deviation
was a low .089 for these data. The product-moment correlation coefficient is

                    Figure 2
   Eigenvalues for the First 15 Factors Extracted
   from the APT and from Corresponding Random Data
             (a) Pretest    (b) Posttest

[Plots not reproduced. Each panel shows eigenvalue (vertical axis) by factor
number, 1 to 15 (horizontal axis), with separate curves for the APT and for
the random data.]



                         Table 2
          Factor Loadings on the Single Factor
     Extracted from APT at Pretest and at Posttest,
          and from Corresponding Random Data

                    Pretest                   Posttest
    Item        APT     Random Data       APT     Random Data

      1        .289        .124          .303       -.042
      2        .088        .027         -.004        .130
      3        .058        .315          .152       -.049
      4        .160        .010          .219       -.051
      5        .191        .230          .226     [illegible]
      6        .263       -.187          .255        .172
      7        .332       -.188          .118        .032
      8     [illegible]    .147          .383        .036
      9        .156        .099          .341        .051
     10        .384        .150          .495       -.017
     11        .453       -.229          .253       -.277
     12        .372       -.178          .244       -.170
     13        .255        .007          .259       -.066
     14        .394        .345          .338        .136
     15        .376     [illegible]      .440        .222
     16        .575       -.089          .545        .023
     17        .426        .075          .436       -.046
     18        .562     [illegible]      .484        .071
     19        .491       -.136          .440        .330
     20        .588        .109          .506        .135
     21        .580        .029          .676        .025
     22        .460        .185          .418        .212
     23        .344       -.200          .378        .319
     24        .370        .402          .433        .084
     25        .338       -.028          .500        .051
     26        .460        .108          .560        .005
     27        .357       -.074          .467       -.015
     28        .117        .044          .141        .054
     29        .495        .042          .481        .044
     30        .291     [illegible]      .294        .196
     31        .292       -.276          .352        .006
     32        .378        .018          .386        .017
     33        .318        .084          .281        .195
     34        .313        .090          .359        .128
     35        .339        .153          .267       -.442

    Percent of
    Total Variance  13.92   3.05         14.59       2.40



sensitive only to differences in the patterns of the loadings and was equal to
.793. The coefficient of congruence is sensitive to differences in both the
level and the pattern of loadings and was a high .972. High values for these
latter two indices indicate a high degree of similarity between the two sets of
factor loadings. The three figures computed from the parallel random data were
.219, .067, and .118, respectively. It was concluded that the factors extracted
from each administration of the APT were nearly identical, both in nature and in
strength.
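
The three similarity indices discussed above can be written out in a few lines. This is a generic sketch using the standard definitions of each index, with hypothetical function names; it is not a reproduction of the authors' computations.

```python
import math


def rms_deviation(a, b):
    """Root-mean-square deviation: sensitive to the level of the loadings."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def pearson_r(a, b):
    """Product-moment correlation: sensitive only to the pattern of loadings."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cross = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cross / math.sqrt(ss_a * ss_b)


def congruence(a, b):
    """Coefficient of congruence: uncentered, so level and pattern both matter."""
    cross = sum(x * y for x, y in zip(a, b))
    return cross / math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
```

A quick way to see the difference among the indices: adding a constant to every loading leaves the correlation at 1.0 but raises the root-mean-square deviation and lowers the coefficient of congruence below 1.0.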



                      Table 3
       Measures of Factor Similarity Between
         Factor Loadings of APT at Pretest
    and at Posttest and Between Factor Loadings
           for Corresponding Random Data

    Similarity Index                        APT     Random Data

    Root-Mean-Square Deviation             .089        .219
    Pearson Product-Moment Correlation     .793        .067
    Coefficient of Congruence              .972        .118



Conclusions

Differences in Achievement Level Estimates

There was evidence in these data to conclude that there were gains in mean
achievement levels observed after a course of instruction. The difference
between the means of scores on the 35-item pretest and posttest was nearly 7
items; the frequency distribution of number-correct scores changed from a
symmetric distribution to one that was negatively skewed and displaced to the
right. This same effect was mirrored in the distributions of item difficulties.
The correlation between the two sets of number-correct scores was .542,
indicating that students did not generally maintain their relative standings in
the course after instruction. It is not known to what extent this correlation
was attenuated due to the ceiling effect observed for the posttest scores.

Differences in the Structure of Achievement

Although there was definitive evidence of mean quantitative change from
pretest to posttest, there was no evidence of qualitative differences in the
factor structure underlying the item responses. The internal consistency
reliability of the test remained constant across administrations. When factor
analyses were performed separately on the pretest and posttest interitem
correlation matrices, essentially the same factor was extracted each time, as
evidenced by the similarity in the levels and pattern of factor loadings.

These data indicate, then, that students in the General College arithmetic
classes were indeed leaving the course with increased levels of the same
variable measured prior to instruction. The change that occurred within the
quarter was quantitative, not qualitative.






STUDY II 



Method



Subjects



Data were collected from students enrolled in a general biology class at
the University of Minnesota during winter quarter of 1980. A paper-and-pencil
pretest was administered to all students present on the first day of class.
Computer-administered conventional posttests were given before classroom
midquarter and final examinations to volunteer students who were awarded
extra-credit points for their participation.



Design 



Tests. There were two different tests administered at various times
throughout the quarter. Test A included 14 items from each of the three content
areas covered in class lectures before the midquarter exam (chemistry, the cell,
and energy). Test B included 14 items from each of the last three content areas
in the course (genetics, reproduction/embryology, and ecology).



Experimental groups. The data collection design for this study is shown in
Figure 3. Students were randomly assigned to two experimental groups, Groups 1
and 2, corresponding to the groups of students who were administered one of two
pretests (Test A or Test B, respectively) on the first day of class. Group 3
included students who were absent for the first class meeting or who did not
record on their answer sheet which test they took.

Figure 3

Data Collection Design for Study II

                                                MQ         Final Exam
           Pretest                           Posttest       Posttest

Group 1    Test A: Content Areas 1-3          Test A         Test A
Group 2    Test B: Content Areas 4-6          Test A         Test B
Group 3    (absent, or test not recorded)     Test A         Test B







During the two weeks immediately preceding the classroom midquarter
examination, volunteer students were administered conventional tests on the
computer (MQ posttest). All these students were administered Test A. During the
two weeks immediately preceding the final exam, volunteer students were
administered conventional tests on the computer (final exam posttest). Students
in Group 1 were readministered Test A; students in Groups 2 and 3 were
administered Test B.

All item responses were coded as correct, incorrect, or missing. Missing
or omitted items did not present an important problem for this set of data.
Nevertheless, the same 15%-missing-data criterion was used here as was used in
the previous study: a student's response protocol was deleted from the data set
if the student omitted more than 6 (i.e., 15% of 42) items on any one test. For
the students included in the analyses, all missing data were coded as incorrect.
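
The screening rule just described can be sketched as follows. This is a minimal illustration, not the authors' procedure: the response coding (1 = correct, 0 = incorrect, `None` = missing) and the function name `screen_and_score` are assumptions introduced here.

```python
MISSING = None  # assumed marker for an omitted item


def screen_and_score(protocols, n_items=42, max_omitted=6):
    """Drop protocols with more than `max_omitted` omitted items; in the
    protocols that are kept, score every omitted item as incorrect (0)."""
    retained = []
    for responses in protocols:
        if len(responses) != n_items:
            raise ValueError("each protocol must contain n_items responses")
        omitted = sum(1 for r in responses if r is MISSING)
        if omitted > max_omitted:
            continue  # fails the 15%-missing-data criterion
        retained.append([1 if r == 1 else 0 for r in responses])
    return retained


# Hypothetical protocols: one with exactly 6 omissions, one with 7.
kept_protocol = [1] * 36 + [MISSING] * 6
dropped_protocol = [1] * 35 + [MISSING] * 7
scored = screen_and_score([kept_protocol, dropped_protocol])
```

A protocol with exactly 6 omissions is retained, because the rule deletes only those with more than 6.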



Analyses 

Differences in achievement level estimates: Test A. The question of
whether or not students' achievement level estimates on Test A increased from
the pretest to the MQ posttest could be answered by examining the performance of
Group 1 students on Test A at both testing occasions. However, the number of
students who took Test A both times was small (N = 102) compared to the total
number of students who took Test A at the pretest only (N = 276) and the total
number of students who took Test A at the MQ posttest only (N = 302). A more
powerful test of the difference in mean achievement levels could be performed by
combining the data from all students who took Test A at the MQ posttest and by
comparing their performance with that of all the students who took Test A as a
pretest.

For this comparison, it was necessary to assume that the three groups of
students being combined at the MQ posttest were equivalent. Group 1 students
were administered Test A both at the pretest and at the MQ posttest. (Although
Test A was also administered again at the final exam posttest, the number of
Group 1 students who returned to take Test A at the final exam posttest was too
small for meaningful comparisons to be made. Hence, Test A analyses were
confined to the pretest and MQ posttest administrations.) Performance of Group 1
students on Test A at the MQ posttest can be attributed to the students'
underlying ability, to the classroom instruction, and/or to the repetition of
items from one occasion to the next. Group 2 students, on the other hand, were
administered Test B as the pretest and were administered Test A for the first
time at the MQ posttest. Performance of Group 2 students on Test A, then, could
be attributed only to the students' underlying ability and/or to the classroom
instruction. For some Group 3 students (those who were absent on the first day
of class), performance on Test A could also be attributed to their underlying
ability and/or to the classroom instruction only. For the other Group 3 students
(those who did not record which pretest they took), however, Test A performance
could be attributed to their underlying ability, to the classroom instruction,
and/or to item repetition. Since these two subgroups of Group 3 students could
not be identified and separated for analysis, however, Group 3 was omitted from
the following comparison for Test A.



Because students were randomly assigned to Groups 1 and 2 on the first day
of class, and because classroom instruction was the same for all students, any
differences observed between Groups 1 and 2 on their performance on Test A would
reflect a repetition-of-items effect. If mean test scores of Groups 1 and 2
were not significantly different from each other, then Groups 1 and 2 could be
combined at the MQ posttest and compared with all students from Group 1 at the
pretest. If a significant repetition-of-items effect were found, then
subsequent analyses should be performed only on the data from those students in
Group 1. Differences between the scores of Group 1 and Group 2 students were
evaluated by the use of a t test for the difference between two independent
groups and by the Kolmogorov-Smirnov two-sample test for the difference between
two frequency distributions.
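
As an illustration of the second of these tests, the sketch below computes the Kolmogorov-Smirnov two-sample statistic D (the largest gap between the two cumulative relative frequency distributions) and the common large-sample chi-square approximation, 4D²n₁n₂/(n₁ + n₂) with 2 degrees of freedom, used for a one-tailed test. The report does not say which form of the statistic it computed, so treating the chi-square form as relevant is an assumption; the function names and the example scores are hypothetical.

```python
from bisect import bisect_right


def ks_two_sample_d(x, y):
    """Largest absolute difference between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n1, n2 = len(xs), len(ys)
    d = 0.0
    for v in sorted(set(xs) | set(ys)):
        cdf_x = bisect_right(xs, v) / n1  # proportion of x scores <= v
        cdf_y = bisect_right(ys, v) / n2  # proportion of y scores <= v
        d = max(d, abs(cdf_x - cdf_y))
    return d


def ks_chi_square(d, n1, n2):
    """Large-sample chi-square approximation (2 df) for a one-tailed test."""
    return 4.0 * d * d * n1 * n2 / (n1 + n2)


# Hypothetical number-correct scores for two small groups.
group_1 = [1, 2, 3, 4]
group_2 = [3, 4, 5, 6]
d = ks_two_sample_d(group_1, group_2)
chi2 = ks_chi_square(d, len(group_1), len(group_2))
```

Because the statistic compares whole distributions, it can detect differences in shape or spread that a test of means alone would miss.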

Analyses relevant to the issue of differences in achievement scores
included examination of the frequency distributions and summary statistics of
number-correct scores and the distributions of item difficulties from the
pretest and the MQ posttest.

Differences in the structure of achievement: Test A. The question of
whether or not there were qualitative changes in the nature of achievement test
scores due to instruction was again investigated, as in Study I, by analysis of
internal consistency reliability coefficients and by separate principal-axes
factor analyses. These analyses were performed separately on the pretest and MQ
posttest interitem correlation matrices, with communalities estimated using
an iterative procedure, as described in Study I. The number of nonrandom
factors was again determined by comparing the results of the factor analyses of
Test A data with the results of factor analyses of random data based on items of
similar difficulty.
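
One way to build such comparison data is sketched below. The report does not describe how its random data were generated, so drawing each response independently with probability equal to the observed item difficulty is an assumption made here for illustration; the function names and the seed are hypothetical.

```python
import random


def item_difficulties(data):
    """Proportion of examinees answering each item correctly (0/1 data)."""
    n_examinees = len(data)
    n_items = len(data[0])
    return [sum(row[j] for row in data) / n_examinees for j in range(n_items)]


def random_parallel_data(difficulties, n_examinees, seed=0):
    """Independent random 0/1 responses whose expected item difficulties
    match `difficulties`; the items share no common factor by construction."""
    rng = random.Random(seed)
    return [
        [1 if rng.random() < p else 0 for p in difficulties]
        for _ in range(n_examinees)
    ]


# Hypothetical difficulties for three items.
target = [0.2, 0.5, 0.8]
simulated = random_parallel_data(target, n_examinees=2000, seed=1)
observed = item_difficulties(simulated)
```

Factors extracted from data of this kind reflect chance alone, so they provide a baseline against which eigenvalues from the real item responses can be judged.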

The results of the final solutions from the pretest and the MQ posttest
were then compared in terms of the numbers of factors extracted and the
similarity of these factors. As in Study I, factor similarity was indexed by the
root-mean-square deviation, the product-moment correlation coefficient, and the
coefficient of congruence between the factor loadings obtained at each occasion
in comparison with values obtained from two sets of random data.

Differences in achievement level estimates: Test B. The question of
whether or not students' achievement level estimates on Test B increased from
the pretest to the final exam posttest could be answered by examining the
performance of Group 2 students on Test B at both testing occasions. However, if
no significant repetition-of-items effect was found for Test A (as discussed
above), the assumption could be made that there would be no repetition-of-items
effect for Test B; then there would be justification for combining the data on
Test B from Groups 2 and 3 at the final exam in order to conduct a more powerful
test of the difference between mean achievement level estimates. Analyses
relevant to this question included examination of the frequency distributions and
summary statistics of number-correct scores, and the distributions of item
difficulties from the pretest and the final exam posttest.

Differences in the structure of achievement: Test B. As described above,
the internal consistency reliability coefficient (coefficient alpha) was
computed for Test B at the pretest and at the final exam posttest. Separate
principal-axes factor analyses were also performed on the Test B data and on
parallel random data. The final factor solutions of Test B from the pretest and
the final exam posttest were also compared in terms of the number of factors
extracted and the similarity of these factors, as was done in Study I and for
Test A in this study.

Results

Effect of Item Repetition 

The effect on achievement level estimates of repeating items from the
pretest to a posttest was evaluated by comparing the performance of students in
Groups 1 and 2 on Test A administered before the midquarter exam (MQ posttest).
There were 102 students from Group 1 who volunteered to take the MQ posttest, of
which 98 met the 15%-missing-data criterion and were retained for analyses. For
Group 2 these figures were 101 and 91, respectively.

Appendix Table C presents the frequency distributions of number-correct
scores for Test A administered at the MQ posttest to students from Groups 1 and
2; the frequency polygons are displayed in Figure 4. For Group 1 the mean test
score was 24.19, the median was 23.79, and the standard deviation was 5.87. For
Group 2 these statistics were 22.59, 21.80, and 6.26, respectively. The t
statistic for the difference between the means of independent groups was
calculated to be 1.98; this was not statistically significant at p = .01. The
entire frequency distributions of Groups 1 and 2 were compared by using a
Kolmogorov-Smirnov two-sample test; the statistic calculated was equal to 7.86,
which was not statistically significant at p = .01.

Figure 4

Grouped Frequency Distributions of Number-Correct Scores
for Biology Test A Administered at MQ Posttest
for Groups 1 and 2

[Figure: frequency polygons for Groups 1 and 2; horizontal axis, Number-Correct Score]



Although the observed differences were in the predicted direction, the
effect of item repetition was not statistically significant. Hence, the question
of identifying and separating the two subgroups of Group 3 was no longer
relevant, and the Test A MQ posttest scores of students in Groups 1, 2, and 3
were combined for comparison with the scores of all students who took Test A on
the first day of class. Since some of the students who took the test at the
pretest did not take it at the posttest, the correlation between scores at
pretest and posttest was not computed.

Missing Data

There were 276 students who were administered Test A at the pretest; of
these 272 met the 15%-missing-data criterion and were retained for further
analyses. The combined total of students who took Test A at the MQ posttest was
302, and 283 of these were retained for further analyses.

Because there was no effect of item repetition observed for Test A, the
performance of Group 2 students who were administered Test B at the pretest was
compared with the performance of students from both Groups 2 and 3 who were
administered Test B at the final exam posttest. There were 283 students who were
administered Test B at the pretest, of which 277 met the 15%-missing-data
criterion and were retained for further analyses. A total of 169 students took
Test B at the final exam posttest, and 163 of them were retained for further
analyses.


Differences in Achievement Level Estimates: Test A 


Total score differences. Frequency distributions of number-correct scores
on Test A at both testing occasions are presented in Appendix Table D; the
frequency polygons appear in Figure 5. Both distributions are approximately
symmetric, with the distribution of MQ posttest scores displaced to the right.
The mean of the pretest scores was 15.97, with a standard deviation of 3.97. For
the MQ posttest scores, these figures were 23.46 and 5.99, respectively. The
mean score difference between the two occasions was 7.49. Because there was
some overlap between the students in the two groups, the groups were not
strictly independent, nor were they strictly dependent. A t test for the
difference between two independent means, although technically inappropriate,
would yield a conservative test of the significance of this difference. This
test resulted in t (df = 553) = 17.34, p < .001.
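
The reported test can be approximately reconstructed from the summary statistics given above. The sketch assumes the usual pooled-variance form of the independent-groups t test, which the report does not spell out, and uses the retained sample sizes of 272 (pretest) and 283 (MQ posttest); the function name is hypothetical. It gives 553 degrees of freedom and a t of roughly 17.3, close to the reported 17.34; the small gap is presumably due to rounding in the published means and standard deviations.

```python
import math


def pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Independent-groups t statistic with a pooled variance estimate."""
    df = n_1 + n_2 - 2
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1.0 / n_1 + 1.0 / n_2))
    return (mean_2 - mean_1) / std_error, df


# Test A summary statistics: pretest vs. MQ posttest.
t_stat, df = pooled_t(15.97, 3.97, 272, 23.46, 5.99, 283)
print(f"t({df}) = {t_stat:.2f}")
```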

Item difficulties. The frequency distributions of item difficulties for
Test A at both testing occasions are given in Table 4. As indicated earlier,
the pretest was somewhat difficult: 74% of the items were answered correctly by
less than half the students, and no item was answered correctly more than 80% of
the time. After instruction, more than half the items (23 of 42) were answered
correctly by 51% to 90% of the students, although five items were answered
correctly less than 30% of the time.

Differences in the Structure of Achievement: Test A 

Internal consistency reliability. Coefficient alpha for Test A when
administered on the first day of class was .490. This low value indicates that
the average interitem correlation was correspondingly small. After instruction,
coefficient alpha increased to .787 for the same set of items. Although this
value is not high for a 42-item test, it represents a substantial increase over
the value obtained at the pretest. The difference between these two figures may
indicate that the items were functioning as a set differently after instruction
than they were before instruction and/or it may reflect the increase in the
variance of the number-correct scores.
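
The link between coefficient alpha and the average interitem correlation can be made concrete with the standardized (Spearman-Brown) form of alpha, alpha = k·r̄ / (1 + (k − 1)·r̄). Using this form is an assumption: the report's alpha was presumably computed from raw item variances, so the implied correlations below are approximations only. The function names are hypothetical.

```python
def standardized_alpha(mean_r, n_items):
    """Standardized alpha implied by a mean interitem correlation."""
    return n_items * mean_r / (1.0 + (n_items - 1) * mean_r)


def mean_interitem_r(alpha, n_items):
    """Invert the standardized-alpha formula for the mean interitem r."""
    return alpha / (n_items - (n_items - 1) * alpha)


# Approximate mean interitem correlations implied by the two reported alphas
# for the 42-item Test A.
r_pretest = mean_interitem_r(0.490, 42)   # about .02
r_posttest = mean_interitem_r(0.787, 42)  # about .08
```

Under this approximation the average interitem correlation is very small on both occasions, but it is several times larger after instruction.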

Figure 5

Grouped Frequency Distributions of Number-Correct Scores
for Biology Test A Administered at Pretest and at MQ Posttest

[Figure: frequency polygons for Pretest and MQ Posttest; horizontal axis, Number-Correct Score]



Table 4
Frequency Distributions of Item
Difficulties for Biology Test A
Administered at Pretest
and at MQ Posttest

Range of Item         Number of Items
Difficulty           Pretest    Posttest

 .00 -  .10             1           1
 .11 -  .20             8           1
 .21 -  .30             8           3
 .31 -  .40             9
 .41 -  .50             5
 .51 -  .60             4
 .61 -  .70             2           5
 .71 -  .80             5
 .81 -  .90             0           8
 .91 - 1.00             0           0
Mean Difficulty        .38         .56



Figure 6
Eigenvalues for the First 15 Factors Extracted from Test A
Administered at Pretest and at MQ Posttest,
and from Corresponding Random Data

[Figure: two panels, (a) Pretest and (b) MQ Posttest; eigenvalue plotted against factor number]



Number of factors extracted. Appendix Table E presents the eigenvalues and
percent of total variance accounted for by the first 15 factors from Test A and
from corresponding random data. Figure 6a presents the plots of eigenvalues
versus factors extracted from Test A and from random data at the pretest, and
Figure 6b presents results for the MQ posttest. Comparison of the results from
Test A with the results from the corresponding random data revealed that there
was one weak factor present in the pretest and one stronger factor present in
the posttest.

Factor similarity. Table 5 presents the factor loadings on the single
factor extracted at each testing occasion from Test A and from corresponding
random data. Comparison of these factor loadings reveals that the loadings from
the MQ posttest were, in general, higher than those from the pretest. No loading
from the pretest was greater than [...], and nearly two-thirds of the factor
loadings (26 of 42) were less than .200. For the MQ posttest, the highest loading
was .502, but 81% of the factor loadings (34 of 42) were greater than .200.

This result can also be seen by comparing the percentages of total variance
accounted for by the single factor at each administration. For the pretest that
figure was 3.96% (as compared to 2.88% for the random data); for the MQ posttest
the factor accounted for 9.36% of the total variance (as compared to 2.79% for
the random data). Both of these percentages are small for a 42-item test,
indicating that the factor was relatively weak, even at the MQ posttest.
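
The percentages above can be related to the factor loadings and eigenvalues with a short calculation. The sketch assumes that "total variance" means the number of items (each standardized item contributing unit variance), so that the percentage for a factor is its sum of squared loadings divided by the number of items; the report does not state this definition explicitly, and the function names are hypothetical.

```python
def percent_of_total_variance(loadings):
    """Percent of total variance for one factor: 100 * sum(l^2) / k."""
    return 100.0 * sum(l * l for l in loadings) / len(loadings)


def sum_of_squared_loadings(percent, n_items):
    """Sum of squared loadings implied by a percent-of-variance figure."""
    return percent / 100.0 * n_items


# Values implied by the reported percentages for the 42-item Test A.
ssl_pretest = sum_of_squared_loadings(3.96, 42)    # about 1.66
ssl_posttest = sum_of_squared_loadings(9.36, 42)   # about 3.93
```

Under that assumption, the single factor carries the equivalent of fewer than two items' worth of variance at the pretest and about four at the MQ posttest.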

The pattern of factor loadings did not appear to be consistent across test
administrations. The items with the lowest loadings at the pretest did not
emerge as the items with the lowest loadings at the MQ posttest, and the same
was true for the items with the highest loadings.

Table 6 presents the measures of factor similarity between the two sets of
loadings for Test A and the corresponding random data. The root-mean-square
deviation between the two sets of loadings for Test A, sensitive to differences
in levels of the loadings, was .195, a high value when considered in conjunction
with the relatively narrow range of loadings observed in these data. The
product-moment correlation coefficient between the loadings, sensitive to pattern
differences, was a low .373. The coefficient of congruence was .780. The
similarity measures obtained from the random data were .160, .549, and .548,
respectively. All these figures reveal that the factors extracted from Test A on
the two occasions were not substantially more similar than were factors extracted
from randomly generated data.

These data reveal, then, that the factor extracted from Test A at the
pretest differed substantially from that extracted at the MQ posttest. Although
there was a sizeable increase in the number-correct scores after instruction,
there was a corresponding change in the first factor underlying the item
responses. This indicates that the pretest and the MQ posttest measured quite
different variables, even though they were composed of exactly the same items.

Differences in Achievement Level Estimates: Test B 

Total score differences. Frequency distributions of number-correct scores
on Test B at both testing occasions are given in Appendix Table F; their fre- 
quency polygons are presented in Figure 7. The distribution of final exam post- 



Table 5

Factor Loadings on the Single Factor
Extracted from Biology Test A at Pretest and at MQ Posttest,
and from Corresponding Random Data

                  Pretest                    Posttest
Item      Test A   Random Data      Test A   Random Data

1           .068      -.032           .186       .158
2           .024      -.026           .133      -.205
3           .331      -.245           .161       .051
4           .115       .163           .279       .150
5          -.002      -.238           .276      -.099
6           .2?6      -.054           .008       .029
7           .280       .191           .372       .121
8           .191      -.246           .333      -.153
9           .272       .096           .408       .120
10          .027      -.005           .367      -.002
11          .291      -.163           .154      -.154
12          .103      -.035           .207       .011
13          .370       .327           .502       .208
14          .391      -.197           .344      -.223
15          .042       .440           .38?       .418
16          .273      -.010           .341       .29?
17          .133      -.042           .335       .079
18          .239      -.105           .310
19          .388       .021           .276       .162
20          .205       .362                      .222
21          .115                      .316      -.098
22          .223      -.040           .479      -.161
23          .383       .060           .298       .024
24          .245       .067           .373      -.114
25          .052      -.053           .228       .18?
26         -.024      -.116           .246      -.105
27          .039       .091           .478       .083
28          .015      -.094           .143       .060
29          .117       .061                      .244
30          .343      -.139           .372      -.224
31          .095       .070           .200       .057
32          .194      -.027           .284      -.154
33          .043       .179           .272       .255
34          .059      -.050           .249       .337
35          .096      -.150           .301       .190
36         -.026       .148           .245       .206
37          .221      -.139           .340      -.021
38          .107      -.185           .227      -.095
39          .106       .282           .241      -.016
40         -.111      -.344          -.030       .077
41         -.124                      .164      -.041
42          .063       .113           .422       .117
Percent of
Total Variance
            3.96       2.88           9.36       2.79





Table 6

Measures of Factor Similarity Between Factor
Loadings for Test A at Pretest and at MQ
Posttest, and Between Factor Loadings
from Corresponding Random Data

Similarity Index             Test A    Random Data

Root-Mean-Square
  Deviation                  .195        .160
Pearson Product-Moment
  Correlation                .373        .549
Coefficient of
  Congruence                 .780        .548



test scores is approximately symmetric, while that of the- pretest scores is 
slightly positively skewed. The mean of the pretest scores was 15.18, with
standard deviation 3.54. For the final exam posttest scores, these figures were
21.47 and 4.58, respectively. The difference between the mean scores on
the two occasions was 6.29. As before, a t test for the difference between two
independent means, though technically inappropriate, was conducted as a
conservative test of this difference; here, t (df = 438) = 16.15, p < .001.



Figure 7

Grouped Relative Frequency Distributions of Number-Correct Scores
for Biology Test B Administered at Pretest and at Final Exam Posttest

[Figure: frequency polygons for Pretest and Final Exam Posttest; horizontal axis, Number-Correct Score]



Item difficulties. The frequency distributions of item difficulties for
Test B at both testing occasions are given in Table 7. As was observed for the
number-correct scores, the pattern of item difficulties reveals that the pretest
was somewhat difficult: 74% of the items were answered correctly by less than
half the students, and only two items were answered correctly more than 80% of
the time. At the end of the course, more than half the items (22 of 42) were
answered correctly by the majority of students, although 12 items were answered
correctly less than 30% of the time.



Table 7

Frequency Distributions of Item
Difficulties for Biology Test B
Administered at Pretest and
at Final Exam Posttest

Range of Item         Number of Items
Difficulty           Pretest    Posttest

 .00 -  .10             4           2
 .11 -  .20             9           3
 .21 -  .30             8           7
 .31 -  .40             3           4
 .41 -  .50             7           4
 .51 -  .60             5           2
 .61 -  .70             2          10
 .71 -  .80             2           5
 .81 -  .90             2           4
 .91 - 1.00             0           1
Mean Difficulty        .36         .51
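
The summary statements about Test B item difficulties can be checked directly against the frequencies in Table 7. The short sketch below does so; the list and function names are hypothetical. Note that the table's bins only allow "less than 30%" to be approximated by the three lowest bins (.00 through .30).

```python
# Table 7 frequencies, lowest difficulty bin (.00-.10) first.
pretest = [4, 9, 8, 3, 7, 5, 2, 2, 2, 0]
posttest = [2, 3, 7, 4, 4, 2, 10, 5, 4, 1]


def items_in_bins(freqs, first_bin, last_bin):
    """Number of items falling in bins first_bin..last_bin (0-based, inclusive)."""
    return sum(freqs[first_bin:last_bin + 1])


n_items = sum(pretest)                                          # 42 items
pct_pretest_half_or_less = 100.0 * items_in_bins(pretest, 0, 4) / n_items
pretest_above_80 = items_in_bins(pretest, 8, 9)                 # bins .81-1.00
posttest_above_50 = items_in_bins(posttest, 5, 9)               # bins .51-1.00
posttest_30_or_less = items_in_bins(posttest, 0, 2)             # bins .00-.30
```

These counts reproduce the figures in the text: about 74% of pretest items at or below .50, two pretest items above .80, 22 posttest items above .50, and 12 posttest items in the lowest three bins.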



Differences in the Structure of Achievement: Test B

Internal consistency reliability. When administered at the pretest on the
first day of class, coefficient alpha for Test B was .398, increasing to .630
when administered at the final exam posttest. These low values indicate that
the average interitem correlation coefficient was correspondingly small. Even
though both reliability coefficients were relatively low, the fact that the
reliability coefficient increased from .40 to .63 may be an indication that the
items were functioning as a set differently after instruction than they were
before instruction. As before, however, this increase may simply be reflecting
the increase in the variance of the test scores.



Number of factors extracted. Appendix Table G presents the eigenvalues and
percentages of total variance accounted for by the first 15 factors extracted
from Test B and from corresponding random data. Figure 8a presents the plots of
these eigenvalues versus factors extracted at the pretest, and Figure 8b
presents similar data from the final exam posttest. Comparison of the results
from the real data with the results from the random data reveals that there was
no factor stronger than one extracted from the random data in the pretest, but
one stronger factor was extracted from Test B at the final exam posttest.

Factor similarity. Table 8 presents the factor loadings on the single
factor extracted at each testing occasion from Test B and from corresponding
random data. Comparison of these factor loadings reveals that the loadings from the



Figure 8

Eigenvalues for the First 15 Factors Extracted from Biology Test B
Administered at Pretest and at Final Exam Posttest,
and from Corresponding Random Data

[Figure: two panels, (a) Pretest and (b) Final Exam Posttest; eigenvalue plotted against factor number]



Table 8

Factor Loadings on the Single Factor Extracted
from Biology Test B at Pretest and at Final Exam Posttest
and from Corresponding Random Data

                  Pretest                    Posttest
Item      Test B   Random Data      Test B   Random Data

1           .131
2           .073
3          -.023
4           .218
5           .252      -.286
6           .268
7           .191       .145
8           .127      -.113           .296
9          -.044
10          .32?
11          .193       .471
12          .164       .117           .311
13          .393      -.111           .371
14         -.007      -.136           .438
15          .228      -.085           .261
16          .329      -.099           .301
17          .246
18          .154       .381
19          .192       .098           .241
20         -.027       .341           .193
21          .231      -.151           .307
22         -.239      -.156           .268
23          .459       .213           .299
24          .062       .067           .079       .140
25          .009       .182           .330
26          .045      -.10?           .174      -.044
27         -.101       .034          -.112      -.057
28          .130      -.080           .043       .119
29          .296      -.245           .084
30          .215       .077           .155       .328
31          .252       .179           .397       .003
32          .278       .020           .177      -.197
33         -.045       .045                     -.082
34          .028      -.277           .137       .003
35          .012       .3?4           .165       .093
36          .166                                -.071
37         -.115      -.034          -.023      -.026
38          .018                     -.002       .009
39          .082       .12?           .011       .053
40          .040       .109           .178      -.088
41          .013      -.457           .105
42         -.058       .510          -.111      -.071
Percent of
Total Variance
            3.69       4.70           5.96       2.54






final exam posttest were, in general, slightly higher than those from the
pretest. The highest pretest loading was .459, and nearly two-thirds of the
factor loadings (27 of 42) were less than .200. For the final exam posttest, the
highest loading was .438, but more than half of the factor loadings (23 of 42)
were greater than .200.

This result can also be seen by comparing the percentage of total variance
accounted for by the single factor extracted at each administration. For the
pretest, that figure was 3.69% (as compared to 4.70% accounted for by the random
factor); for the final exam posttest, the factor accounted for 5.96% of the
total variance (as compared to 2.54% for the random data). Both of these
percentages are very small, indicating that the factor was relatively weak.

The pattern of factor loadings did not appear consistent across test
administrations. The items with the lowest loadings at the pretest did not
necessarily emerge as the items with the lowest loadings at the final exam
posttest, and the same was true for the items with the highest loadings.

Table 9 presents the measures of factor similarity for Test B. The
root-mean-square deviation between the two sets of loadings for Test B, sensitive
to differences in levels of the loadings, was .177, a high value when considered
in conjunction with the relatively narrow range of loadings observed in these
data but lower than the .300 observed for the two sets of random data. The
product-moment correlation coefficient between the loadings, sensitive to pattern
differences, was a low .399 as contrasted with r = -.327 for the random data. The
coefficient of congruence was .697 for Test B and -.255 for the random data.
Although the comparison of the similarity measures reveals that the factor
loadings for Test B were more congruent than the corresponding sets of random
data, the degree of similarity was so low that these factors could not
justifiably be considered congruent.

Table 9

Measures of Factor Similarity Between Factor
Loadings from Test B at Pretest and at Final
Exam Posttest, and Between Factor Loadings
from Corresponding Random Data

Similarity Index             Test B    Random Data

Root-Mean-Square
  Deviation                  .177        .300
Pearson Product-Moment
  Correlation                .399       -.327
Coefficient of
  Congruence                 .696       -.255



These data reveal, then, that the factor extracted from Test B at the
pretest differed from the factor extracted at posttest. As was observed for Test
A, there was a sizeable increase in the number-correct scores, accompanied by a
change in the factor underlying the item responses. This indicates that the
pretest and the final exam posttest were measuring quite different variables,
even though they were composed of exactly the same items.



Conclusions 

Differences in Achievement Level Estimates

The results from both Test A and Test B indicate that there were mean
differences in achievement level estimates (number-correct scores) that
accompanied classroom instruction. On the average, test scores increased after
relevant course instruction; for these data, scores increased between 6 and 7.5
points on a 42-item test. The increases in these test scores were not
attributable to the effect of item repetition. Although the differences were in
the predicted direction, neither a t test nor the Kolmogorov-Smirnov two-sample
test was significant at the .01 level.


Differences in the Structure of Achievement 

There were substantial differences in the structure of item responses to the
items on both biology tests, Test A and Test B, from the pretest to the
posttest. Large increases in the internal consistency reliability coefficient
may reflect corresponding changes in the average interitem correlation
coefficients. That is, changes in the way the items functioned together as a set
were evident after instruction took place. This same effect was observed when
the factor structures of the tests at both administrations were compared.
Although only one factor was extracted at each administration of each test, the
factor at each pretest was very weak and bore little relationship to the factor
extracted later in the course, as reflected in the patterns and levels of the
factor loadings.
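The link between internal consistency and the average interitem correlation can be made concrete with the standardized form of coefficient alpha (the Spearman-Brown relation). The sketch below is an illustration under stated assumptions: the report does not say which reliability coefficient was computed, the function name is hypothetical, and the two average correlations used in the example are invented values chosen only to show the direction of the effect on a 42-item test.

```python
def standardized_alpha(n_items, mean_interitem_r):
    """Standardized coefficient alpha from the mean interitem correlation.

    alpha = k * r / (1 + (k - 1) * r), where k is the number of items and
    r is the average correlation among them (Spearman-Brown relation).
    """
    return n_items * mean_interitem_r / (1 + (n_items - 1) * mean_interitem_r)


# Hypothetical average interitem correlations before and after instruction.
for label, r_bar in (("pretest", 0.02), ("posttest", 0.08)):
    print(label, round(standardized_alpha(42, r_bar), 3))
```

With the number of items held fixed, any rise in the average interitem correlation raises the coefficient, so a large gain in reliability from pretest to posttest implies that the items were more strongly interrelated after instruction.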

DISCUSSION AND CONCLUSIONS 

The results of these studies show that the use of simple difference scores to
measure change in classroom achievement may not be appropriate for all subject
matter areas. The use of simple difference scores, or some derivative thereof,
assumes that there is only a quantitative difference between pretest and
posttest achievement levels due to a course of instruction. That is, the
assumption is made that a pretest measures a baseline amount of some knowledge
or trait and that classroom instruction results in increased levels of the same
trait, as indicated by higher scores on the same, or a similar, test.

This assumption was supported by the results of the mathematics data. There was
a large and statistically significant difference observed in achievement test
scores obtained before and after instruction. That the same trait was being
measured both times was indicated by the high degree of similarity of the
underlying factor structure of the test when examined at both points in time.
The only change observed in the mathematics test scores was, then, a
quantitative one, reflected in increases in mean number-correct score after
classroom instruction in mathematics.

The results were quite different for the two biology tests examined. Factor
analyses of the pretests revealed the presence of one very weak factor for each
pretest. One slightly stronger factor also emerged at each of the posttests,
but there was very little correspondence between the pretest and posttest
factors. Even though mean test scores increased after instruction, there was a
corresponding difference in the factors underlying test performance. The change
that occurred in the biology test scores, then, was a qualitative one, where
the tests were measuring different variables before and after instruction.
Evaluating gains in achievement by computing pretest-posttest difference scores
cannot be justified under these circumstances.

That the results from these two studies are different has important bearing on
the issue of program evaluation and the measurement of change. The question of
whether the difference in test scores that follows classroom instruction or
program participation is quantitative or qualitative must be answered before
any attempt at quantifying change can legitimately be made. For some courses of
instruction, the application of classical change-score methodology may be
defended on the grounds that the only change observed was quantitative; for
others, the use of such methodology may not be justified. Clearly, further
research is needed to define those areas where the use of change scores or
their derivatives may be warranted.






REFERENCES 

Anastasi, A. The influence of specific experience upon mental organization.
Genetic Psychology Monographs, 1936, 18, 245-355.

Cronbach, L. J., & Furby, L. How should we measure "change"--or should we?
Psychological Bulletin, 1970, 74, 68-80.

Ferguson, G. A. On learning and human ability. Canadian Journal of Psychology,
1954, 8, 95-112.

Fleishman, E. A. A factor analysis of intra-task performance on two psychomotor
tasks. Psychometrika, 1953, 18, 45-55.

Fleishman, E. A. A comparative study of aptitude patterns in unskilled and
skilled psychomotor performances. Journal of Applied Psychology, 1957, 41,
263-272.

Fleishman, E. A. Abilities at different stages of practice in rotary pursuit
performance. Journal of Experimental Psychology, 1960, 60, 162-171.

Fleishman, E. A., & Fruchter, B. Factor structure and predictability of
successive stages of learning Morse code. Journal of Applied Psychology, 1960,
44, 97-101.

Fleishman, E. A., & Hempel, W. E., Jr. Changes in factor structure of a complex
psychomotor task as a function of practice. Psychometrika, 1954, 19, 239-252.

Fleishman, E. A., & Hempel, W. E., Jr. The relationship between abilities and
improvement with practice in a visual discrimination reaction task. Journal
of Experimental Psychology, 1955, 49, 301-312.

Games, P. A. A factorial analysis of verbal learning tasks. Journal of
Experimental Psychology, 1962, 63, 1-11.

Garrett, H. E. A developmental theory of intelligence. American Psychologist,
1946, 1, 372-378.

Greene, E. B. An analysis of random and systematic changes with practice.
Psychometrika, 1943, 8, 37-52.

Harman, H. H. Modern factor analysis (3rd ed.). Chicago: University of Chicago
Press, 1976.

Kingsbury, G. G., & Weiss, D. J. Effect of point-in-time in instruction on the
measurement of achievement (Research Report 79-4). Minneapolis: University
of Minnesota, Department of Psychology, Psychometric Methods Program, August
1979.

Lord, F. M. Elementary models for measuring change. In C. W. Harris (Ed.),
Problems in measuring change. Madison: University of Wisconsin, 1963.

Nesselroade, J. R., & Baltes, P. B. On a dilemma of comparative factor
analysis: A study of factor matching based on random data. Educational and
Psychological Measurement, 1970, 30, 935-948.

Nie, N. H., Hull, C. H., Jenkins, J. G., Steinbrenner, K., & Bent, D. H.
Statistical package for the social sciences (2nd ed.). New York: McGraw-Hill,
1975.

Querishi, M. Y. Patterns of psycholinguistic development during early and middle
childhood. Educational and Psychological Measurement, 1967, 27, 353-365.

Reinert, G. Comparative factor analytic studies of intelligence throughout the
human life span. In L. R. Goulet & P. B. Baltes (Eds.), Life-span
developmental psychology: Research and theory. New York: Academic Press, 1970.

Sullivan, J. P., & Moran, L. J. Association structures of bright children at age
six. Child Development, 1967, 38, 793-800.

Swartz, J. D., & Moran, J. Association structures of bright children at ages
nine and twelve. Multivariate Behavioral Research, 1968, 3, 189-198.

Wohlwill, J. F. Methodology and research strategy in the study of developmental
change. In L. R. Goulet & P. B. Baltes (Eds.), Life-span developmental
psychology: Research and theory. New York: Academic Press, 1970.

Woodrow, H. The relation between abilities and improvement with practice.
Journal of Educational Psychology, 1938, 29, 215-230.

Woodrow, H. The application of factor-analysis to problems of practice. Journal
of General Psychology, 1939, 21, 457-460. (a)

Woodrow, H. Factors in improvement with practice. Journal of Psychology, 1939,
7, 55-70. (b)

Woodrow, H. The relation of verbal ability to improvement with practice in
verbal tests. Journal of Educational Psychology, 1939, 30, 179-186. (c)






Appendix: Supplementary Tables

                               Table A
          Frequency Distributions of Number-Correct Scores
               for APT Pretest and Posttest (N = 220)

                  Pretest                            Posttest
                           Cumulative                        Cumulative
Score  Frequency  Percent     Percent    Frequency  Percent     Percent

   35          0      0.0       100.0            4      1.8       100.0
   34          1      0.5       100.0           20      9.1        98.2
   33          4      1.8        99.5           28     12.7        89.1
   32          7      3.2        97.7           29     13.2        76.4
   31          7      3.2        94.5           19      8.6        63.2
   30         13      5.9        91.4           25     11.4        54.5
   29          5      2.3        85.5           16      7.3        43.2
   28         13      5.9        83.2           19      8.6        35.9
   27          5      2.3        77.3           11      5.0        27.3
   26          8      3.6        75.0            8      3.6        22.3
   25         14      6.4        71.4            7      3.2        18.6
   24         20      9.1        65.0            7      3.2        15.5
   23         17      7.7        55.9            6      2.7        12.3
   22         10      4.5        48.2            1      0.5         9.5
   21         14      6.4        43.6            5      2.3         9.1
   20         16      7.3        37.3            4      1.8         6.8
   19          6      2.7        30.0            1      0.5         5.0
   18         11      5.0        27.3            3      1.4         4.5
   17         11      5.0        22.3            1      0.5         3.2
   16          9      4.1        17.3            0      0.0         2.7
   15          7      3.2        13.2            0      0.0         2.7
   14          2      0.9        10.0            1      0.5         2.7
   13          4      1.8         9.1            2      0.9         2.3
   12          4      1.8         7.3            1      0.5         1.4
   11          7      3.2         5.5            0      0.0         0.9
   10          1      0.5         2.3            1      0.5         0.9
    9          0      0.0         1.8            0      0.0         0.5
    8          3      1.4         1.8            1      0.5         0.5
    7          1      0.5         0.5            0      0.0         0.0

Mean       22.26                             28.91
SD          5.97                              4.88
Median     22.74                             30.10
Mode          24                                32
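The summary statistics printed beneath a frequency table of this kind can be recomputed directly from the frequencies. The following Python sketch is illustrative only; the function name is hypothetical, and two conventions are assumed rather than stated in the report: the standard deviation uses N - 1 in the denominator, and the median is the interpolated median that treats each integer score as the interval from score - 0.5 to score + 0.5. The example uses the APT posttest frequencies from Table A, with the rows taken to run from a score of 35 down to 7.

```python
import math


def frequency_summary(freq):
    """Return (n, mean, sd, median) for a dict mapping score -> frequency."""
    n = sum(freq.values())
    mean = sum(score * f for score, f in freq.items()) / n
    # Sample standard deviation (N - 1 denominator; an assumed convention).
    variance = sum(f * (score - mean) ** 2 for score, f in freq.items()) / (n - 1)

    # Interpolated median: locate the score interval containing the N/2-th
    # case and interpolate linearly within it.
    cumulative = 0
    median = None
    for score in sorted(freq):
        f = freq[score]
        if f > 0 and cumulative + f >= n / 2:
            median = (score - 0.5) + (n / 2 - cumulative) / f
            break
        cumulative += f
    return n, mean, math.sqrt(variance), median


# APT posttest frequencies from Table A (scores 35 down to 7).
apt_posttest = {
    35: 4, 34: 20, 33: 28, 32: 29, 31: 19, 30: 25, 29: 16, 28: 19,
    27: 11, 26: 8, 25: 7, 24: 7, 23: 6, 22: 1, 21: 5, 20: 4, 19: 1,
    18: 3, 17: 1, 16: 0, 15: 0, 14: 1, 13: 2, 12: 1, 11: 0, 10: 1,
    9: 0, 8: 1, 7: 0,
}
n, mean, sd, median = frequency_summary(apt_posttest)
# Compare with the tabled posttest values: mean 28.91, SD 4.88, median 30.10.
print(n, round(mean, 2), round(sd, 2), round(median, 2))
```

Under these two assumptions the sketch reproduces the posttest mean, standard deviation, and median given in Table A, which suggests, but does not establish, that the original analysis followed the same conventions.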




                                   Table B

          Eigenvalues and Percent of Total Variance
    Accounted for by First 15 Factors Extracted from the APT
  at Pretest and at Posttest, and from Corresponding Random Data

                     Pretest                             Posttest
                APT          Random Data            APT          Random Data
         Eigen-  % Total   Eigen-  % Total   Eigen-  % Total   Eigen-  % Total
Factor    value Variance    value Variance    value Variance    value Variance

     1    5.350     15.3    1.545      4.4    5.590     16.0    1.419      4.1
     2    1.555      4.4    1.308      3.7    1.605      4.6    1.253      3.6
     3    1.539      4.4    1.229      3.5    1.337      3.8    1.161      3.3
     4    1.209      3.5    1.139      3.3    1.171      3.3    1.134      3.2
     5    1.086      3.1    1.029      2.9    1.034      3.0    1.052      3.0
     6    1.016      2.9     .993      2.8    1.006      2.9    1.023      2.9
     7     .942      2.7     .890      2.5     .986      2.8     .896      2.6
     8     .892      2.5     .865      2.5     .939      2.7     .828      2.4
     9     .876      2.5     .822      2.3     .839      2.4     .814      2.3
    10     .794      2.3     .767      2.2 [illeg.]      2.3     .790      2.3
    11     .739      2.1     .745      2.1     .756      2.2     .770      2.2
    12     .666      1.9     .692      2.0     .675      1.9     .732      2.1
    13     .607      1.7     .634      1.8     .660      1.9     .702      2.0
    14     .597      1.7     .600      1.7     .604      1.7     .666      1.9
    15     .553      1.6     .566      1.6     .533      1.5     .600      1.7
























                               Table C
         Frequency Distribution of Number-Correct Scores for
  Biology Test A at MQ Posttest, for Students in Groups 1 and 2

              Group 1 (N = 98)                   Group 2 (N = 91)
                           Cumulative                        Cumulative
Score  Frequency  Percent     Percent    Frequency  Percent     Percent

   41          1      1.0       100.0            0      0.0       100.0
   40          0      0.0        99.0            0      0.0       100.0
   39          0      0.0        99.0            0      0.0       100.0
   38          0      0.0        99.0            1      1.1       100.0
   37          2      2.0        99.0            1      1.1        98.9
   36          1      1.0        96.9            0      0.0        97.8
   35          0      0.0        95.9            1      1.1        97.8
   34          1      1.0        95.9            3      3.3        96.7
   33          2      2.0        94.9            1      1.1        93.4
   32          3      3.1        92.9            2      2.2        92.3
   31          2      2.0        89.8            4      4.4        90.1
   30          5      5.1        87.8            1      1.1        85.7
   29          6      6.1        82.7            3      3.3        84.6
   28          4      4.1        76.5            1      1.1        81.3
   27          5      5.1        72.4            6      6.6        80.2
   26          6      6.1        67.3            5      5.5        73.6
   25          6      6.1        61.2            5      5.5        68.1
   24          7      7.1        55.1            2      2.2        62.6
   23         10     10.2        48.0            6      6.6        60.4
   22          7      7.1        37.8            5      5.5        53.8
   21          9      9.2        30.6            6      6.6        48.4
   20          3      3.1        21.4            6      6.6        41.8
   19          5      5.1        18.4            4      4.4        35.2
   18          2      2.0        13.3            9      9.9        30.8
   17          3      3.1        11.2            5      5.5        20.9
   16          1      1.0         8.2            5      5.5        15.4
   15          1      1.0         7.1            3      3.3         9.9
   14          1      1.0         6.1            1      1.1         6.6
   13          1      1.0         5.1            1      1.1         5.5
   12          2      2.0         4.1            1      1.1         4.4
   11          1      1.0         2.0            2      2.2         3.3
   10          0      0.0         1.0            1      1.1         1.1
    9          1      1.0         1.0            0      0.0         0.0

Mean       24.19                             22.59
SD          5.87                              6.26
Median     23.79                             21.80
Mode          23                                18








                               Table D

        Frequency Distribution of Number-Correct Scores
       for Biology Test A at Pretest and at MQ Posttest

              Pretest (N = 272)                  Posttest (N = 283)
                           Cumulative                        Cumulative
Score  Frequency  Percent     Percent    Frequency  Percent     Percent

   41          0      0.0       100.0            1      0.4       100.0
   40          0      0.0       100.0            0      0.0        99.6
   39          0      0.0       100.0            0      0.0        99.6
   38          0      0.0       100.0            1      0.4        99.6
   37          0      0.0       100.0            4      1.4        99.3
   36          0      0.0       100.0            2      0.7        97.9
   35          0      0.0       100.0            3      1.1        97.2
   34          0      0.0       100.0            4      1.4        96.1
   33          0      0.0       100.0            5      1.8        94.7
   32          0      0.0       100.0            6      2.1        92.9
   31          0      0.0       100.0            9      3.2        90.8
   30          1      0.4       100.0            8      2.8        87.6
   29          0      0.0        99.6           15      5.3        84.8
   28          1      0.4        99.6            9      3.2        79.5
   27          1      0.4        99.3           17      6.0        76.3
   26          0      0.0        98.9           16      5.7        70.3
   25          2      0.7        98.9           23      8.1        64.7
   24          5      1.8        98.2           15      5.3        56.5
   23          8      2.9        96.3           24      8.5        51.2
   22          6      2.2        93.4           15      5.3        42.8
   21          8      2.9        91.2           19      6.7        37.5
   20          9      3.3        88.2           14      4.9        30.7
   19         25      9.2        84.9           10      3.5        25.8
   18         23      8.5        75.7           16      5.7        22.3
   17         34     12.5        67.3           13      4.6        16.6
   16         23      8.5        54.8            9      3.2        12.0
   15         24      8.8        46.3            7      2.5         8.8
   14         30     11.0        37.5            5      1.8         6.4
   13         25      9.2        26.5            3      1.1         4.6
   12         13      4.8        17.3            3      1.1         3.5
   11         15      5.5        12.5            4      1.4         2.5
   10          7      2.6         7.0            1      0.4         1.1
    9          5      1.8         4.4            2      0.7         0.7
    8          3      1.1         2.6            0      0.0         0.0
    7          3      1.1         1.5            0      0.0         0.0
    6          0      0.0         0.4            0      0.0         0.0
    5          0      0.0         0.4            0      0.0         0.0
    4          1      0.4         0.4            0      0.0         0.0

Mean       15.97                             23.46
SD          3.97                              5.99
Median     15.94                             23.35
Mode          17                                23











                                   Table E

    Eigenvalues and Percent of Total Variance Accounted for by
    First 15 Factors Extracted from Biology Test A at Pretest
       and at MQ Posttest and Corresponding Random Data

                     Pretest                           MQ Posttest
              Test A         Random Data          Test A         Random Data
         Eigen-  % Total   Eigen-  % Total   Eigen-  % Total   Eigen-  % Total
Factor    value Variance    value Variance    value Variance    value Variance

     1    2.200      5.2    1.706      4.1    4.411     10.5    1.572      3.7
     2    1.512      3.6    1.456      3.5    1.440      3.4    1.358      3.2
     3    1.395      3.3    1.299      3.1    1.349      3.2    1.302      3.1
     4    1.298      3.1    1.172      2.8    1.167      2.8    1.238      2.9
     5    1.167      2.8    1.053      2.5    1.026      2.4    1.134      2.7
     6    1.136      2.7    1.044      2.5     .980      2.3    1.103      2.6
     7    1.075      2.6    1.001      2.4     .895      2.1    1.017      2.4
     8    1.064      2.5     .913      2.2     .885      2.1     .999      2.4
     9    1.004      2.4     .901      2.1     .844      2.0     .915      2.2
    10     .951      2.3     .876      2.1     .825      2.0     .839      2.0
    11     .923      2.2     .845      2.0     .784      1.9     .810      1.9
    12     .820      2.0     .813      1.9     .771      1.8     .783      1.9
    13     .805      1.9     .793      1.9     .748      1.8     .726      1.7
    14     .757      1.8     .751      1.8     .696      1.7     .663      1.6
    15     .726      1.7     .677      1.6     .598      1.4     .611      1.5
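The "% Total Variance" columns in the eigenvalue tables follow from the eigenvalues themselves when the factors are extracted from a correlation matrix with unities in the diagonal, because the total variance then equals the number of items. The short Python sketch below illustrates the arithmetic; the function name is hypothetical, and the unit-diagonal assumption is inferred from the tabled values rather than stated in this section. The 42-item test length is the one given in the Conclusions for the biology tests.

```python
def percent_total_variance(eigenvalues, n_items):
    """Percent of total variance for each eigenvalue of an n_items x n_items
    correlation matrix (total variance = n_items under a unit diagonal)."""
    return [100.0 * value / n_items for value in eigenvalues]


# First eigenvalues of Biology Test A at pretest and at MQ posttest (Table E).
first_factors = [2.200, 4.411]
print([round(p, 1) for p in percent_total_variance(first_factors, 42)])
```

Applied to the first Test A eigenvalues, the calculation returns the 5.2 and 10.5 percent shown in Table E, so the doubling of the first eigenvalue after instruction corresponds directly to a doubling of the variance attributed to the first factor.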








                               Table F

        Frequency Distribution of Number-Correct Scores
    for Biology Test B at Pretest and at Final Exam Posttest

              Pretest (N = 277)                  Posttest (N = 163)
                           Cumulative                        Cumulative
Score  Frequency  Percent     Percent    Frequency  Percent     Percent

   33          0      0.0       100.0            1      0.6       100.0
   32          1      0.4       100.0            2      1.2        99.4
   31          0      0.0        99.6            3      1.8        98.2
   30          0      0.0        99.6            1      0.6        96.3
   29          0      0.0        99.6            5      3.1        95.7
   28          0      0.0        99.6            8      4.9        92.6
   27          0      0.0        99.6            8      4.9        87.7
   26          0      0.0        99.6            6      3.7        82.8
   25          1      0.4        99.6            5      3.1        79.1
   24          2      0.7        99.3            8      4.9        76.1
   23          4      1.4        98.6           13      8.0        71.2
   22          4      1.4        97.1           16      9.8        63.2
   21          6      2.2        95.7           17     10.4        53.4
   20         10      3.6        93.5           15      9.2        42.9
   19         12      4.3        89.9           10      6.1        33.7
   18         27      9.7        85.6           12      7.4        27.6
   17         31     11.2        75.8           10      6.1        20.2
   16         29     10.5        64.6           10      6.1        14.1
   15         30     10.8        54.2            5      3.1         8.0
   14         29     10.5        43.3            3      1.8         4.9
   13         23      8.3        32.9            3      1.8         3.1
   12         22      7.9        24.5            0      0.0         1.2
   11         21      7.6        16.6            2      1.2         1.2
   10         16      5.8         9.0            0      0.0         0.0
    9          7      2.5         3.2            0      0.0         0.0
    8          2      0.7         0.7            0      0.0         0.0

Mean       15.18                             21.47
SD          3.54                              4.58
Median     15.12                             21.18



                                   Table G

  Eigenvalues and Percent of Total Variance Accounted for by First
 15 Factors Extracted from Biology Test B at Pretest and at Final Exam
          Posttest and from Corresponding Random Data

                     Pretest                       Final Exam Posttest
              Test B         Random Data          Test B         Random Data
         Eigen-  % Total   Eigen-  % Total   Eigen-  % Total   Eigen-  % Total
Factor    value Variance    value Variance    value Variance    value Variance

     1    2.043      4.9    2.440      5.8    3.124      7.4    1.810      4.3
     2    1.551      3.7    1.448      3.4  1.92[?]      4.6    1.678      4.0
     3    1.345      3.2    1.190      2.8    1.590      3.8    1.550      3.7
     4    1.204      2.9    1.146      2.7    1.480      3.5    1.513      3.6
     5    1.152      2.7    1.098      2.7    1.383      3.3    1.466      3.5
     6    1.065      2.5    1.053      2.5    1.309      3.1    1.370      3.3
     7     .932      2.2     .999      2.4    1.284      3.1    1.305      3.1
     8     .911      2.2     .929      2.2    1.167      2.8    1.234      2.9
     9     .887      2.1     .920      2.2    1.151      2.7    1.215      2.9
    10     .835      2.0     .852      2.0    1.059      2.5    1.105      2.6
    11     .796      1.9     .770      1.8     .978      2.3    1.030      2.5
    12     .781      1.9     .739      1.8     .964      2.3     .966      2.3
    13     .747      1.8     .702      1.7     .927      2.2     .895      2.1
    14     .709      1.7     .684      1.6     .911      2.2     .857      2.0
    15     .685      1.6     .668      1.6     .819      2.0     .803      1.9






Distribution List 



Navy 




1 Dr. Fd Aiken 

Navy Perjonne* RAD Center 
S*n Dioeo, CA f'152 

f Hcryl S* Paker 
NPRDC , 
Code P; n 9 ' » 
San Diego, CA 92152 

c 

1 Dr. Jack R. Hor sting. 
, Provost A Academic Dean 
U.fr. H.nvrl Post^Vwfuatf School 
, » , Monterey, CA 9?9*»0 

1 Chi»f of »*v*l Eduction and Training 
LUSon Office 
Air Forrr lluman Pesource Laboratory 
Flyifig Training Division 
WTLLIAM5 AFB, A7 fp2?tf 

1 ^ CDR KJkc Curran 

Office of Navnl Rccearc». 
300 M. Quiney St. w 
Code 270' * 
^Arlington, VA 772M 

1 . J)R. PAT* FEDERJCO 

"HAVY. PERSONNEL R&D CEHtf* 
K l SAN DIEC0..CA 92152- 

1 Mr, P.iul Foley 

Navy Personnel RAD Center 
Son D^ego. CA* 92162 

1 Dr. John Ford _ * 

Navy Personnel RAD Center 
San Diego, CA 92152 

\ Dr. Patrick *f^arri son 
Psychology. CSursc Director 
LEADER^H|P & LAW DEPT. (?b) 
DTV. OF PROFESSIONAL DEVELOPMENT • 
U.S. NAVAL ACADEMV 
- ANNAPOLIS, MD £lW' 

1 . Dr. Norman J. Kerr 

Chief of Naval Technical Training 
Naval Air Station Memphis (75) 
Millington, TH 38054 

Dr.. William L.* Ma^oy ' 
Principal Civilian Advisor for v 

Education qnd Training 
Naval Training Command, Code 00A 
Pcnrffccola, EL 22508 4 

CAPT Richard L. Martin, USN* 

Prospective Commanding Officer 

JUS? Carl Vinson (CVN-70) 

tewlbrt News Shipbuilding fcnri Drydock Co 

W^7por*,News, VA 

« 

Or.^SVraes McBrirto ' f 
» Navy Personnel R*n Center 
San DieRO, CA* 9215? 



Dr HUH** MonUgud 1 
M**y Personnel RAD Ontr*r 
San Dtegft CA 921*?* . 

lir. UilMen Vfcrdbrock 
Instruction**) Progrcn Development 
Dldc. °0 1 
NRT-PDCD 

Orert Lakes Navrl Trrining fen! _ 
IL 6 r PPP 



Dr. FVrnrrd Rtml^nd (M*) 
Navy Pfryonnel RAD Center 
San Di^r.o, C* ^l*" 1 



Ted M. 7. Yrllpn \ 1 
T*cbni<*nl Information Office 
NAVY PERSONNEL RAD CENtFR 
SAN DJECO, CA 9215? 

Libmry, Code P201L 
NavyBer^onnel RAD Center 
San /lego, CA *?152 

* / s , 

Technical Director 

Navy Personnel RAD j^rnter 

Sim Diego, CA °2152 

Conwianding Office^^^r 5 
Naval Rf search Laboratory 
Code 2627 

Washington, DC 20290 

Psychologist 

PNR Branch Office 

BldgMIH, Section D 

Summer^Street 
-Boston, f*AU 02210 , 

Psychologist 
ONR Branch Office 
*V36 S. Clark Street 
Chicago, IL 60605 

e 

Office of Naval Research 
Code 427 

POO N. Quincy SStreet 
Arlington, VA' 22217 

Office of Naval Research * 
Code Mf 

800 N. Cuincy Street ' 
Arlington, VA P22f7 



. 1 



Dr. Worth Se.anland, Director 

Reserrch, Develop**" n^yTrat. A tvMwtion 

K-5 7 

Nnvrl Fducr:tion ;-nd Train inr. Command 

HAT , ponrir>col<>, FL ":& n P 

Or. Robert C. Smith 
.Office of rhi^r-oT Nnv/il fpcrntions 
•«0Pi9R7H 
Washington, DC ?o:5<> * 



1 



Dr. Alfred F.^ Smode 
Training Analysis A 

(tafg) 

Dept. of the Navy 
Orlando, FL 2?P.1** 



1 



Personnel & Training Research Programs 

<eop"eJ»5S> 
Office of Naval Research 
Arlington, VA 22217 



EviJunt.ion Grot 



DrVRi chard Sarensen 
Nav*y Personnel RAO, Center 
Sm Diego, CA 921*5? . # ' 

W. Gary Tlioason 

Naval Ocecn Systems Center 

Codp 71?2 

San Diego, CA 92152 

Roger Weiss inger-Baylon 

Department of ^Administrative, Sciences 

Naval Postgraduate School' 

Monterey, CA 939*0 \ 

Dr, Ronald Weitzman 
Code 54 VZ 

Department of Administrative Sciences / 
U. S. Naval Postgraduate School. 
Itontcrey, CA 9?9 i »0 " v 

Dr. ^Robert Wisher 
Code 309 

Navy Persoithel RAD Center 
San Diego, CA 92152 . 

DR. MARTIN F. WISKOFF 
NAVY PERSONNEL R& D CENTER 
SAN DTEGO, CA • 



Psychologist 
ONR Branch Of fic/ ' 
1030 East Green Street. 
Pasadena, CA 91101 



Mr John H, Wolfe 

Code P310 7 v 
U. S. Navy Personnel Research and 
C Development Center ■ 
SaV Diego, CA 92152 



Army 



Office of the Chief of Naval Operations 
Research Development A Studies Branch 



(OP-115) 
Washington, DC ?0?50 



LT Frank C. Petho, MSC, USN (Ph.D) 
'Selection and Training Research Div 
Humui Performance Sciences Dept. 
Naval Aerospace Medical Research Lnborat 
Pensacola, FL 325(V* 



Technical Direptor 
U. S. Army Research Institute for the 

Behavioral and Social Sciences 
5001 Eisenhower Avenue 
Alexandria, VA 22333 



ERLC 



41 



Dr. Pyron F1s*hl 

U,f. Arm* R«s*".rrr Institute for the 
$oeiM rnd Pchtvioral Sciences 
5001 Eisenhower Avenu* 
Alexandria V> 2?^ 

Dr. Pichrcl Kiplrn » 
U.S. AR V Y ^EfP^PCH TJttTTTlTE 
5001 E^RHHOHFR AVEMUE 
ALFXAKDRV.. VA 22'>?" > 

Dr. PU^on 5. Y.htz 
T r*.ining Tcchnicrl ArJ' 
U.C. Amy Pas^rvh T nsti*j 
S°oV Eiserhow<»r Avenue 
Alexandria. VA 

Dr. Harold F. C # Hei 
Attn: PER T -OX 
Army Research T nsy 
5001 Eisenhower 
Alexandria, Vfr ; 

DR. JAKE? L. /A HEY 
U.S. ARMY RESEARCH flfcT T TTTE 
5001 EISENHfUFR AVCNUE 
ALEXANDRIA J VA ?23? T 



Mr. 
U.S 



Ross 

scare* ^nsfi^ute for th«» 
end Eohrvioral Sciences 
wer Avenue „ 
VA 2?^ 




Sasnor 

'Res*?rch Ir.sti^ut* for the 
viorPl and Socirl Sciences 
Eisenhower Avenu» 
Alexandria. VA 223V 

Commandant 

( Amy Institute of Administrator 
Afttn: Dr. Sherrill 
FT Benjamin Harrison. IN Hf25f 
* » 

Drl Joseph Ward 
/U.S. Army Rfsearcfr Institute 
f 5001 Eisenhower Avenue 
AlexanpYla.WA 2???; 

Air Force 



i Air fiorc* Hunan Resources Lab 
. / AFHRfc/MPD * i 
/ f Brooks AFB, TX 7?? "5 

v » 

1 U,S.**Air Force Office of Scientific 
Pesen^ch 
Life Sciences Directorate, ML 
Boiling Air Force Base 
Washington, DC 2CVJ? 

1 Or. E^rl A. Alluisi 4 
H0<, AFHRL OFSC) ' 
Brooks AFP, TX 732*5 

1* Dr. Alfred t. fregly 
AFOSP/NL. Bldg. WJJ 
n o^ng AFB : 
•/.•Sh.ingto*. DC 2033? 

1 Or. Genevieve HHdnd 
Program Jtanager 
Lif»* Sciences Di rector* 
AF03P 

Boiling AFr, DC ?033? 



Dr.vid R. Ht,nt^r 
AFHRL/MOAK 

erooVs AFP. TX ^ 9 ?;* 

F.eseirch and f**surncnt Division 
Resc.-r* Pr*ncr. AFVPC/'tPCYPR 
P«ndolph. AFP, TX .7*\*~ 

Or. f'aleolm Rec 
t FHRL/KP 

Brooks AFC. TX 7V« 

TaTV/TGH «top ?? 
Shrppnrd AFB, TX :m 1 

Dr . Jo* Ward , 
AFHRL/KPMD 
Brooks AFB, TX Y" r > 
C 

Marines 



1 Director, Pesoi,cr»- .-»nrl Dm, 
*° 'VirWPAM.)* 

^B9K t TNj phntL^on 
V"flhinR\on,/rc ? r ;c: 



1 v :!I?.*-r 4 y As>idr-"*t Ining rr< 

Personnel L'^y 
Cff«ee of Mie l*n-: r S^^st,,ry of DeV n 

Oor P-»se<rc* * Fnr.Jn*«i Ing 
Pooro "Dr ,f >. To* P^riVpcn 
W< ?*:ngtor.. OC ?*"?1 , 

1 >, Or . vsynf, "mn . * ' 

OffW Of th<* Assistant S^crv»;-y 
pf D.^r.-rr." (^PA J L) 
?F\2f? TT fc P*n>ro;on 
I'lsMngton. rc 2?'*1 

1^ OAPPA 

• / n, r n t 4/il j. on plvd> 
Arlington. V/ ?2P^9 



H. William Veenup^ 
Education Advi.*or (F. n 3l5, 
Education Center, PcrEC 
QusnMco, VA 2?M 



CtvM Govt 



director, Office of K*npower Utilizntion 
HQ. «nrir.e Corps ( W PU) 
3CP. lldg. ?009 
Ou^ntko, VA -2?t:U 



Het-dquirters. U. S. Taring Corps 
Code vpi-2 / W 
Washington /DC 20?P0 



Special AssistPn** for Paring 

Corps Matters 
Code 100M 

Office of Naval Research 
800 W. Quincy St. 
Arlington, VA 2P217 r 

Kajor Michael L. Pa*row, USMC 
Heedqu^rters, Marine Corps 
4 Code HPT-fO)' 
Washington, DC 20?80 

J>R. A.L. SUFKOSKY 
SCIENTIFIC ADVISOR (CODE RD-1 ) 
HO, U.S. MAf^NE CORPS 
WASHINGTON, DO 20380- 

CoastGuard 



Chief, Psychological Reserch Branch 
U. S. Const Guard (G-P-1/2/TPU2) 
Washington, DC 20593 

Mp. thonas A. Warm 
U. S. Cocst Guard Institute 
P. C. Substntion IP 
Oklahoma City, OK 7*U*9 



Other Dotf 



12 Defense Technical Tn formation Center 
Cameron Station, Bldg 5 
Alexandria, VA 223V* 
Attn: TC 

1 On. William Grnham _ 
Testing Directorate * ~ 

fTPCCM/MFPCT.* s~ 7 

Ft. Sheridan.' IL f00"»7 ' j 



1 . Mr. Richard fV.KUlip 
Pcrsonn^ 1 R^O Center 
Office of Personnel H^n^gen^nt 
100^ E Street fJW 
Wflshlngton. DC ?0t15 

1 Or. Androw P. Kolnar 
Science Education Dev. 

and Research 
Ma 1 - tonal Science Foundation 
Wcshinjfton. DC 2055 r 

1 Dr. H. Wallace Sineiko 
Program Director 

Manpower Research *nd Advisory Service. 
Smithsonian Institution 
P01 North Pitt Street 
Alexandria, VA 22314 

1 Dr. Vern W. Urry * 
Personnel R«> Center 
Office of Personnel Management 
1900 E Street NW 
Washington, DC 20415 

1 Dr. Joseph L. Young. Director 
Memory & Cognitive Processes 
National Science Foundation 
Washington, DC* 20550 

Non Govt 



Dr. James Algine 
University of Florida 
Gainesville, FL 32611 

Dr. Erling B. Andersen 
Department "Of Statistics 
Studiestraede * . 
1455 Copenhagen 
DENMARK 

Dr. John Annett 04 
Department of Psychology 
University of Warwick 
Coventry CVH 7AL 
ENGLAND 

1* psychological research unit > 
Dept. of Defense (Army Office) 
.Campbell Park Offices 
Canberra ACT 2600, Australia 



3 



nr.,To#»ec Bcjar 
Educational Testing Service 
Princeton, HJ CW50 

Cnpt. J. Jton Be longer 
Training Development Division 
C- nndiPn^orJos Training System 
CFTTHO. tWTTrrnton 
Astra, Ontario KOK 1B0 



Or. Menuct.r Birrnbfum 
fcrool of Education 
Tel Avtv University 
Tel Aviv, Rnmet Aviv *99?8 
Israel 



I)r, Werner Birke 

OexWPs im StreltkraeTtenrnt 

Postfac.h PC 50 03 , 

0-5 ion Bonn ? 

WFST GERMANY 

I 

Or. R. Divwel Pock 
Department of Education.* * 
University of Chlcngo 
Chicago/ IL r0f>J7 

Liaison Scientists 
Office of Naval Research, 
Branch Office , London 
Pox W FPO Hew York 09510 

Dr. Robert Brennan * 
American Coll eg r Testing Programs 
P. 0. Box 168 
I6wn City, IA 522«0 

DR. JOHN F. BROCK 

Honeywell Systems 4 Research Center 
<MN 17-231C) 
?600 Ridgeway Parkway 
Minneapolis, MN 5511? 



DR* C. VICTOR BUNDERSON 
WICAT INC* 

UNIVERSITY PLAZA; SUITE 10 
1160 SO. STATE ST. 
OR£M, UT 31057 

Dr. 'John B. Carroll 
PsVc home trie Lab 
Univ. of No. Caroline 
Davie Hall 012A ' 
Chapel Hill, KC 2751 « 



Charles Myers Library* 
Livingstone House 
Livingstone Road 
Stratford 
London E15 2LJ 




1 



Dr. H»ns Crombag 
Education Research Center 
University of Leyrfrn 
Rocrh*avel>an 2 
3231 ENLeydrn 
The NETHERLANDS 



t 



Dr. Cheater II; rris 
Fchool of Fducjt ion 
University of California 
JVmtn iWirbar*., CA \T10f 



ba^i>, 



1 



1 Dr.,Frttt Drisgow 

Yal* .School of Org^nizaMon^nnd Man;»gerie 
Yale University 
nox 1A 

*New Hav*n, CT 06520 1 

1 Mlko Duroeyer 

T nstruction&l Progrnar Development 
* • BiAlding 90 

NET-PDCD * ' 1 

Crrat Lnkes NTC, TL 600J.8 

*1 ERIC Facility-Acquisitions 
Rugby Avenue v 
Bethesda, MD P001A { 

1 Dr. Benjamin A* Fairbank, Jr. 
McFann-Croy A Associates, Inc. 
5*?5 C8llaghan « 
Suite ?25 

S^n Antonio, Jexns 78228 
• i 
1 Dr. Leonid Feldt 

Lindquist Center for Measurmcnt . 
University of Iowa 



uniy 
Tow* 



City, IA 522*12 




1 Dr. Norman Cliff 

Dept>spf Psychology 
"A Univ. of So. CeliSefhiP 
University Park 
Los An^ltfS, CA 90p07 

*1 Dr. William E. Coffman 

Director, Iowa Testing Programs 
*33* Lindquist Center 
University of .Towa 
*ow* City, IA-522^? +~ 

1 Dr. Kc-redlt^ P. CreWford 

American ^etiological Association 
1200 17th/?treet, N.W. 
W-»shingtok, -DC ?00?6 



Dr. Richr.r'd L. Ferguson 
The American Col-legr^ Testing Prograt 
P.O. Box 168 . 7* 

Iowa City, IA 52240 

Dr. Victor Fields . 
Dept. of Psychology 
Montgomery College 
Rockvilie, HD 20850 

Univ. Prof. Dr. Gerhard Fischer 
blebiggass* 5/2 * * 

A 1010 Vienna ' ' 
AUSTRIA 

Professor Donald Fitzgerald 
University of New England
Armidale, New South Wales 2351
AUSTRALIA

DR. ROBERT GLASER
LRDC
UNIVERSITY OF PITTSBURGH
3939 O'HARA STREET
PITTSBURGH, PA 15213

Dr. Daniel Gopher 

Industrial & Management Engineering 
Technion-Israel Institute of Technology
Haifa 
ISRAEL 

Dr. Bert 

Johns Hopkins University 
Department of Psychology 
Charles & 34th Street
Baltimore, MD 21218 

Dr. Ron Hambleton
School of Education
University of Massachusetts
Amherst, MA 01002

Dr. Delwyn Harnisch
University of Illinois
242b Education
Urbana, IL 61801



Dr. Dustin H. Heuston
Wicat, Inc.
Box 986
Orem, UT 84057

Dr. Lloyd Humphreys
Department of Psychology
University of Illinois
Champaign, IL 61820

Dr. Steven Hunka
Department of Education
University of Alberta
Edmonton, Alberta
CANADA

Dr. Earl Hunt
Dept. of Psychology
University of Washington
Seattle, WA 98105

Dr. Jack Hunter 
2122 Coolidge St.
Lansing, MI 48906

Dr. Huynh Huynh
College of Education 
University of South Carolina 
Columbia, SC 29208 

Professor John A. Keats
University of Newcastle
AUSTRALIA 2308





Mr. Jeff Kelety 

Department of Instructional Technology 
University of Southern California 
Los Angeles, CA 92007

Dr. Stephen Kosslyn 
Harvard University 
Department of Psychology 
33 Kirkland Street
Cambridge, MA 02138 

Dr. Marcy Lansman 
Department of Psychology, NI 25 
University of Washington 
Seattle, WA 98195

Dr. Alan Lesgold 
Learning R&D Center 
University of Pittsburgh 
Pittsburgh, PA 15260 

Dr. Michael Levine
Department of Educational Psychology
210 Education Bldg.
University of Illinois
Champaign, IL 61801

Dr. Charles Lewis 
Faculteit Sociale Wetenschappen
Rijksuniversiteit Groningen
Oude Boteringestraat 23
9712GC Groningen
Netherlands

Dr. Robert Linn
College of Education
University of Illinois
Urbana, IL 61801

Dr. Frederick M. Lord
Educational Testing Service
Princeton, NJ 08540






Dr. James Lumsden
Department of Psychology
University of Western Australia
Nedlands W.A. 6009
AUSTRALIA

Dr. Gary Marco
Educational Testing Service
Princeton, NJ 08540


Dr. Scott Maxwell
Department of Psychology
University of Houston
Houston, TX 77004


Dr. Samuel T. Mayo
Loyola University of Chicago
820 North Michigan Avenue
Chicago, IL 60611

Dr. Erik McWilliams 
Science Education Dev. and Research 
National Science Foundation 
Washington, DC 20550 

Professor Jason Millman
Department of Education
Stone Hall
Cornell University
Ithaca, NY 14853

Dr. Melvin R. Novick
356 Lindquist Center for Measurement
University of Iowa
Iowa City, IA 52242



1 Dr. Jesse Orlansky
Institute for Defense Analyses
400 Army Navy Drive
Arlington, VA 22202

1 Wayne M. Patience 

American Council on Education 
GED Testing Service, Suite 20
One Dupont Circle, NW
Washington, DC 20036

1 Dr. James A. Paulson

Portland State University 
P.O. Box 751 
Portland, OR 97207 

1 MR. LUIGI PETRULLO 

2431 N. EDGEWOOD STREET
ARLINGTON, VA 22207

1 Dr. Steven E. Poltrock
Department of Psychology
University of Denver
Denver, CO 80208



1 DR. DIANE M. RAMSEY-KLEE
R-K RESEARCH & SYSTEM DESIGN
RIDGEMONT DRIVE
MALIBU, CA 90265

1 MINRAT M. L. RAUCH
P II 4
BUNDESMINISTERIUM DER VERTEIDIGUNG
POSTFACH 1328
D-53 BONN 1, GERMANY

1 Dr. Mark D. Reckase
Educational Psychology Dept.
University of Missouri-Columbia
4 Hill Hall
Columbia, MO 65211



Dr. Leonard L. Rosenbaum, Chairman
Department of Psychology
Montgomery College
Rockville, MD 20850

Dr. Ernst Z. Rothkopf
Bell Laboratories
600 Mountain Avenue
Murray Hill, NJ 07974

Dr. Lawrence Rudner
403 Elm Avenue
Takoma Park, MD 20012

Dr. J. Ryan 

Department of Education 
University of South Carolina 
Columbia, SC 29208

PROF. FUMIKO SAMEJIMA
DEPT. OF PSYCHOLOGY
UNIVERSITY OF TENNESSEE
KNOXVILLE, TN 37916

Frank L. Schmidt
Department of Psychology
Bldg. GG
George Washington University
Washington, DC 20052

Dr. Kazuo Shigemasu
University of Tohoku
Department of Educational Psychology
Kawauchi, Sendai 980
JAPAN

Dr. Edwin Shirkey
Department of Psychology
University of Central Florida
Orlando, FL 32816

Dr. Richard Snow 
School of Education 
Stanford University 
Stanford, CA 94305

Dr. Thomas G. Sticht
Director, Basic Skills Division
HUMRRO
300 N. Washington Street
Alexandria, VA 22314

DR. PATRICK SUPPES
INSTITUTE FOR MATHEMATICAL STUDIES IN
THE SOCIAL SCIENCES
STANFORD UNIVERSITY
STANFORD, CA 94305

Dr. Hariharan Swaminathan
Laboratory of Psychometric and
Evaluation Research
School of Education
University of Massachusetts
Amherst, MA 01002

Dr. Brad Sympson 
Psychometric Research Group 
Educational Testing Service
Princeton, NJ 085*1 

Dr. Kikumi Tatsuoka
Computer Based Education Research
Laboratory
252 Engineering Research Laboratory
University of Illinois
Urbana, IL 61801

Dr. David Thissen
Department of Psychology
University of Kansas
Lawrence, KS 66044



1 Dr. Robert Tsutakawa
Department of Statistics
University of Missouri
Columbia, MO 65201

1 Dr. David Vale
Assessment Systems Corporation
2395 University Avenue
Suite 306
St. Paul, MN 55114

1 Dr. Howard Wainer
Division of Psychological Studies
Educational Testing Service
Princeton, NJ 08540

1 DR. SUSAN E. WHITELY
PSYCHOLOGY DEPARTMENT
UNIVERSITY OF KANSAS
LAWRENCE, KANSAS 66044

1 Wolfgang Wildgrube
Streitkraefteamt
Box 20 50 03
D-5300 Bonn 2
WEST GERMANY






Previous Publications (continued) 

77-6. An Adaptive Testing Strategy for Achievement Test Batteries. October 
1977. 

77-5. Calibration of an Item Pool for the Adaptive Measurement of Achievement.
September 1977.

77-4. A Rapid Item-Search Procedure for Bayesian Adaptive Testing. May 1977.

77-3. Accuracy of Perceived Test-Item Difficulties. May 1977.

77-2. A Comparison of Information Functions of Multiple-Choice and Free- 
Response Vocabulary Items. April 1977. 

77-1. Applications of Computerized Adaptive Testing. March 1977.

Final Report: Computerized Ability Testing, 1973-1975. April 1976.

76-5. Effects of Item Characteristics on Test Fairness. December 1976.

76-4. Psychological Effects of Immediate Knowledge of Results and Adaptive 
Ability Testing. June 1976. 

76-3. Effects of Immediate Knowledge of Results and Adaptive Testing on Ability
Test Performance. June 1976.

76-2. Effects of Time Limits on Test-Taking Behavior. April 1976.

76-1. Some Properties of a Bayesian Adaptive Ability Testing Strategy. March
1976. 

75-6. A Simulation Study of Stradaptive Ability Testing. December 1975. 
75-5. Computerized Adaptive Trait Measurement: Problems and Prospects.
November 1975. 

75-4. A Study of Computer-Administered Stradaptive Ability Testing. October
1975.

75-3. Empirical and Simulation Studies of Flexilevel Ability Testing. July
1975.

75-2. TETREST: A FORTRAN IV Program for Calculating Tetrachoric Correlations.
March 1975.

75-1. An Empirical Comparison of Two-Stage and Pyramidal Adaptive Ability
Testing. February 1975.

74-5. Strategies of Adaptive Ability Measurement. December 1974.

74-4. Simulation Studies of Two-Stage Ability Testing. October 1974.

74-3. An Empirical Investigation of Computer-Administered Pyramidal Ability
Testing. July 1974.

74-2. A Word Knowledge Item Pool for Adaptive Ability Measurement. June 1974.

74-1. A Computer Software System for Adaptive Ability Measurement. January
1974.

73-4. An Empirical Study of Computer-Administered Two-Stage Ability Testing.
October 1973.

73-3. The Stratified Adaptive Computerized Ability Test. September 1973.

73-2. Comparison of Four Empirical Item Scoring Procedures. August 1973.

73-1. Ability Measurement: Conventional or Adaptive? February 1973.

Copies of these reports are available, while supplies last, from: 

Computerized Adaptive Testing Laboratory
N660 Elliott Hall
University of Minnesota
75 East River Road
Minneapolis MN 55455 U.S.A.



Previous Publications 



Proceedings of the 1977 Computerized Adaptive Testing Conference. 
July 1978.

Research Reports 

81-4. Factors Influencing the Psychometric Characteristics of an Adaptive
Testing Strategy for Test Batteries. November 1981.

81-3. A Validity Comparison of Adaptive and Conventional Strategies for Mastery
Testing. September 1981.

Final Report: Computerized Adaptive Ability Testing. April 1981.

81-2. Effects of Immediate Feedback and Pacing of Item Presentation on Ability
Test Performance and Psychological Reactions to Testing. February
1981.

81-1. Review of Test Theory and Methods. January 1981. 

80-5. An Alternate-Forms Reliability and Concurrent Validity Comparison of 
Bayesian Adaptive and Conventional Ability Tests. December 1980.

80-4. A Comparison of Adaptive, Sequential, and Conventional Testing Strategies
for Mastery Decisions. November 1980.

80-3. Criterion-Related Validity of Adaptive Testing Strategies. June 1980. 

80-2. Interactive Computer Administration of a Spatial Reasoning Test. April
1980. 

Final Report: Computerized Adaptive Performance Evaluation. February
1980.

80-1. Effects of Immediate Knowledge of Results on Achievement Test Performance
and Test Dimensionality. January 1980.

79-7. The Person Response Curve: Fit of Individuals to Item Characteristic
Curve Models. December 1979.

79-6. Efficiency of an Adaptive Inter-Subtest Branching Strategy in the
Measurement of Classroom Achievement. November 1979.

79-5. An Adaptive Testing Strategy for Mastery Decisions. September 1979.

79-4. Effect of Point-in-Time in Instruction on the Measurement of Achievement.
August 1979.

79-3. Relationships among Achievement Level Estimates from Three Item
Characteristic Curve Scoring Methods. April 1979.

Final Report: Bias-Free Computerized Testing. March 1979.

79-2. Effects of Computerized Adaptive Testing on Black and White Students.
March 1979.

79-1. Computer Programs for Scoring Test Data with Item Characteristic Curve
Models. February 1979.

78-5. An Item Bias Investigation of a Standardized Aptitude Test. December
1978.

78-4. A Construct Validation of Adaptive Achievement Testing. November 1978.

78-3. A Comparison of Levels and Dimensions of Performance in Black and White
Groups on Tests of Vocabulary, Mathematics, and Spatial Ability.
October 1978.

78-2. The Effects of Knowledge of Results and Test Difficulty on Ability Test
Performance and Psychological Reactions to Testing. September 1978.

78-1. A Comparison of the Fairness of Adaptive and Conventional Testing
Strategies. August 1978.

77-7. An Information Comparison of Conventional and Adaptive Tests in the
Measurement of Classroom Achievement. October 1977.

-continued overleaf- 






